[GH-ISSUE #6057] Ollama create from Model failed #3787

Closed
opened 2026-04-12 14:37:11 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @rentianxiang on GitHub (Jul 29, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6057

What is the issue?

I downloaded the llama3.1:70b model directly from llama.meta.com and am trying to import it into Ollama.
The import stops at the "processing tensors" step.
I have tried multiple times today, and it fails at this stage every time.
Did I do anything wrong?
This is related to another issue I raised (https://github.com/ollama/ollama/issues/5852): since I am not able to download the model from Ollama directly, I decided to download it first and then import it into Ollama.

Commands:
PS D:\ollama> ollama create llama3.1:70b
transferring model data
unpacking model metadata
processing tensors

My Modelfile:
FROM D:\LLMs\Meta-Llama-3.1-70B-Instruct
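
For reference, the direct Meta download is raw PyTorch checkpoint shards, which Ollama converts during `ollama create` (the stack trace further down shows it unpickling them via gopickle). A quick way to see how much data the converter has to process before importing — this is my own helper, not part of Ollama:

```python
from pathlib import Path

def total_checkpoint_size_gib(model_dir: str) -> float:
    """Sum the sizes of all files under a model directory, in GiB."""
    total_bytes = sum(
        p.stat().st_size for p in Path(model_dir).rglob("*") if p.is_file()
    )
    return total_bytes / 2**30

# e.g. total_checkpoint_size_gib(r"D:\LLMs\Meta-Llama-3.1-70B-Instruct")
```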

The blob files are generated: [screenshot]

The model file briefly appears and is then deleted: [screenshot]

Server Log:
2024/07/29 22:48:44 routes.go:1099: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\Users\rtx\.ollama\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\Users\rtx\AppData\Local\Programs\Ollama\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-07-29T22:48:44.747+08:00 level=ERROR source=images.go:774 msg="couldn't remove blob" blob=1164794173 error="remove C:\Users\rtx\.ollama\models\blobs\1164794173: The directory is not empty."
time=2024-07-29T22:48:44.748+08:00 level=INFO source=images.go:784 msg="total blobs: 11"
time=2024-07-29T22:48:44.814+08:00 level=INFO source=images.go:791 msg="total unused blobs removed: 1"
time=2024-07-29T22:48:44.815+08:00 level=INFO source=routes.go:1146 msg="Listening on 127.0.0.1:11434 (version 0.3.0)"
time=2024-07-29T22:48:44.819+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [rocm_v6.1 cpu cpu_avx cpu_avx2 cuda_v11.3]"
time=2024-07-29T22:48:44.819+08:00 level=INFO source=gpu.go:205 msg="looking for compatible GPUs"
time=2024-07-29T22:48:45.102+08:00 level=INFO source=types.go:105 msg="inference compute" id=GPU-e3ce22d3-ac09-e72f-5795-3c3f0a60b4d2 library=cuda compute=8.9 driver=12.5 name="NVIDIA GeForce RTX 4080 SUPER" total="16.0 GiB" available="14.7 GiB"
[GIN] 2024/07/29 - 22:54:14 | 200 | 0s | 127.0.0.1 | HEAD "/"
[GIN] 2024/07/29 - 22:54:14 | 200 | 1.6607ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2024/07/29 - 22:56:15 | 200 | 0s | 127.0.0.1 | HEAD "/"
[GIN] 2024/07/29 - 22:56:15 | 404 | 503.3µs | 127.0.0.1 | POST "/api/show"
[GIN] 2024/07/29 - 22:56:29 | 200 | 0s | 127.0.0.1 | HEAD "/"
[GIN] 2024/07/29 - 22:56:29 | 404 | 0s | 127.0.0.1 | POST "/api/show"
[GIN] 2024/07/29 - 23:01:35 | 200 | 0s | 127.0.0.1 | HEAD "/"
[GIN] 2024/07/29 - 23:01:35 | 200 | 0s | 127.0.0.1 | GET "/api/ps"
[GIN] 2024/07/29 - 23:01:38 | 200 | 0s | 127.0.0.1 | HEAD "/"
[GIN] 2024/07/29 - 23:01:38 | 200 | 612.8µs | 127.0.0.1 | GET "/api/tags"
[GIN] 2024/07/29 - 23:01:43 | 200 | 0s | 127.0.0.1 | HEAD "/"
[GIN] 2024/07/29 - 23:01:43 | 200 | 18.7638ms | 127.0.0.1 | POST "/api/show"
time=2024-07-29T23:01:43.811+08:00 level=INFO source=sched.go:701 msg="new model will fit in available VRAM in single GPU, loading" model=C:\Users\rtx.ollama\models\blobs\sha256-87048bcd55216712ef14c11c2c303728463207b165bf18440b9b84b07ec00f87 gpu=GPU-e3ce22d3-ac09-e72f-5795-3c3f0a60b4d2 parallel=4 available=15753904128 required="6.2 GiB"
time=2024-07-29T23:01:43.811+08:00 level=INFO source=memory.go:309 msg="offload to cuda" layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[14.7 GiB]" memory.required.full="6.2 GiB" memory.required.partial="6.2 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[6.2 GiB]" memory.weights.total="4.7 GiB" memory.weights.repeating="4.3 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
time=2024-07-29T23:01:43.816+08:00 level=INFO source=server.go:383 msg="starting llama server" cmd="C:\Users\rtx\AppData\Local\Programs\Ollama\ollama_runners\cuda_v11.3\ollama_llama_server.exe --model C:\Users\rtx\.ollama\models\blobs\sha256-87048bcd55216712ef14c11c2c303728463207b165bf18440b9b84b07ec00f87 --ctx-size 8192 --batch-size 512 --embedding --log-disable --n-gpu-layers 33 --no-mmap --parallel 4 --port 63814"
time=2024-07-29T23:01:43.820+08:00 level=INFO source=sched.go:437 msg="loaded runners" count=1
time=2024-07-29T23:01:43.820+08:00 level=INFO source=server.go:583 msg="waiting for llama runner to start responding"
time=2024-07-29T23:01:43.820+08:00 level=INFO source=server.go:617 msg="waiting for server to become available" status="llm server error"
INFO [wmain] build info | build=3440 commit="d94c6e0c" tid="21640" timestamp=1722265303
INFO [wmain] system info | n_threads=16 n_threads_batch=-1 system_info="AVX = 1 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 0 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 0 | " tid="21640" timestamp=1722265303 total_threads=32
INFO [wmain] HTTP server listening | hostname="127.0.0.1" n_threads_http="31" port="63814" tid="21640" timestamp=1722265303
llama_model_loader: loaded meta data with 29 key-value pairs and 291 tensors from C:\Users\rtx.ollama\models\blobs\sha256-87048bcd55216712ef14c11c2c303728463207b165bf18440b9b84b07ec00f87 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = llama
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Meta Llama 3.1 8B Instruct
llama_model_loader: - kv 3: general.finetune str = Instruct
llama_model_loader: - kv 4: general.basename str = Meta-Llama-3.1
llama_model_loader: - kv 5: general.size_label str = 8B
llama_model_loader: - kv 6: general.license str = llama3.1
llama_model_loader: - kv 7: general.tags arr[str,6] = ["facebook", "meta", "pytorch", "llam...
llama_model_loader: - kv 8: general.languages arr[str,8] = ["en", "de", "fr", "it", "pt", "hi", ...
llama_model_loader: - kv 9: llama.block_count u32 = 32
llama_model_loader: - kv 10: llama.context_length u32 = 131072
llama_model_loader: - kv 11: llama.embedding_length u32 = 4096
llama_model_loader: - kv 12: llama.feed_forward_length u32 = 14336
llama_model_loader: - kv 13: llama.attention.head_count u32 = 32
llama_model_loader: - kv 14: llama.attention.head_count_kv u32 = 8
llama_model_loader: - kv 15: llama.rope.freq_base f32 = 500000.000000
llama_model_loader: - kv 16: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 17: general.file_type u32 = 2
llama_model_loader: - kv 18: llama.vocab_size u32 = 128256
llama_model_loader: - kv 19: llama.rope.dimension_count u32 = 128
llama_model_loader: - kv 20: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 21: tokenizer.ggml.pre str = llama-bpe
llama_model_loader: - kv 22: tokenizer.ggml.tokens arr[str,128256] = ["!", """, "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 23: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 24: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv 25: tokenizer.ggml.bos_token_id u32 = 128000
llama_model_loader: - kv 26: tokenizer.ggml.eos_token_id u32 = 128009
llama_model_loader: - kv 27: tokenizer.chat_template str = {% set loop_messages = messages %}{% ...
llama_model_loader: - kv 28: general.quantization_version u32 = 2
llama_model_loader: - type f32: 65 tensors
llama_model_loader: - type q4_0: 225 tensors
llama_model_loader: - type q6_K: 1 tensors
time=2024-07-29T23:01:44.074+08:00 level=INFO source=server.go:617 msg="waiting for server to become available" status="llm server loading model"
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.7999 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = llama
llm_load_print_meta: vocab type = BPE
llm_load_print_meta: n_vocab = 128256
llm_load_print_meta: n_merges = 280147
llm_load_print_meta: vocab_only = 0
llm_load_print_meta: n_ctx_train = 131072
llm_load_print_meta: n_embd = 4096
llm_load_print_meta: n_layer = 32
llm_load_print_meta: n_head = 32
llm_load_print_meta: n_head_kv = 8
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 4
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: n_embd_v_gqa = 1024
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 14336
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 0
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 131072
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: model type = 8B
llm_load_print_meta: model ftype = Q4_0
llm_load_print_meta: model params = 8.03 B
llm_load_print_meta: model size = 4.33 GiB (4.64 BPW)
llm_load_print_meta: general.name = Meta Llama 3.1 8B Instruct
llm_load_print_meta: BOS token = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token = 128009 '<|eot_id|>'
llm_load_print_meta: LF token = 128 'Ä'
llm_load_print_meta: EOT token = 128009 '<|eot_id|>'
llm_load_print_meta: max token length = 256
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 4080 SUPER, compute capability 8.9, VMM: yes
llm_load_tensors: ggml ctx size = 0.27 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors: CUDA_Host buffer size = 281.81 MiB
llm_load_tensors: CUDA0 buffer size = 4155.99 MiB
llama_new_context_with_model: n_ctx = 8192
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 500000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: CUDA0 KV buffer size = 1024.00 MiB
llama_new_context_with_model: KV self size = 1024.00 MiB, K (f16): 512.00 MiB, V (f16): 512.00 MiB
llama_new_context_with_model: CUDA_Host output buffer size = 2.02 MiB
llama_new_context_with_model: CUDA0 compute buffer size = 560.00 MiB
llama_new_context_with_model: CUDA_Host compute buffer size = 24.01 MiB
llama_new_context_with_model: graph nodes = 1030
llama_new_context_with_model: graph splits = 2
INFO [wmain] model loaded | tid="21640" timestamp=1722265307
time=2024-07-29T23:01:47.646+08:00 level=INFO source=server.go:622 msg="llama runner started in 3.83 seconds"
[GIN] 2024/07/29 - 23:01:47 | 200 | 3.8840834s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/07/29 - 23:02:02 | 200 | 6.8410567s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/07/29 - 23:03:03 | 200 | 7.8408725s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/07/29 - 23:04:10 | 200 | 6.8992154s | 127.0.0.1 | POST "/api/chat"
[GIN] 2024/07/29 - 23:13:44 | 200 | 0s | 127.0.0.1 | HEAD "/"
[GIN] 2024/07/29 - 23:30:20 | 201 | 5m34s | 127.0.0.1 | POST "/api/blobs/sha256:e95f4b961ddd29031a98e9f84e2e2469d1005ee58d88dee7058cccf71a48cac5"
runtime: VirtualAlloc of 117440512 bytes failed with errno=1455
fatal error: out of memory

runtime stack:
runtime.throw({0x182ef79?, 0xe9f5baa000?})
runtime/panic.go:1023 +0x65 fp=0x346afffcf0 sp=0x346afffcc0 pc=0x85e9a5
runtime.sysUsedOS(0xe9f3e00000, 0x7000000)
runtime/mem_windows.go:83 +0x1bb fp=0x346afffd50 sp=0x346afffcf0 pc=0x83d21b
runtime.sysUsed(...)
runtime/mem.go:77
runtime.(*mheap).allocSpan(0x21267a0, 0x3800, 0x0, 0x1)
runtime/mheap.go:1347 +0x487 fp=0x346afffdf0 sp=0x346afffd50 pc=0x84f767
runtime.(*mheap).alloc.func1()
runtime/mheap.go:964 +0x5c fp=0x346afffe38 sp=0x346afffdf0 pc=0x84ef1c
runtime.systemstack(0xc000581180)
runtime/asm_amd64.s:509 +0x49 fp=0x346afffe48 sp=0x346afffe38 pc=0x8904a9

goroutine 195 gp=0xc000105a40 m=12 mp=0xc0003df008 [running]:
runtime.systemstack_switch()
runtime/asm_amd64.s:474 +0x8 fp=0xc00039a910 sp=0xc00039a900 pc=0x890448
runtime.(*mheap).alloc(0x7000000?, 0x3800?, 0xa0?)
runtime/mheap.go:958 +0x5b fp=0xc00039a958 sp=0xc00039a910 pc=0x84ee7b
runtime.(*mcache).allocLarge(0x83babd?, 0x7000000, 0x1)
runtime/mcache.go:234 +0x87 fp=0xc00039a9a8 sp=0xc00039a958 pc=0x83bfa7
runtime.mallocgc(0x7000000, 0x16cda80, 0x1)
runtime/malloc.go:1165 +0x597 fp=0xc00039aa30 sp=0xc00039a9a8 pc=0x832fb7
runtime.makeslice(0xc0003df008?, 0xd9cb6b6480?, 0x0?)
runtime/slice.go:107 +0x49 fp=0xc00039aa58 sp=0xc00039aa30 pc=0x874b89
github.com/nlpodyssey/gopickle/pytorch.(*BFloat16Storage).SetFromFileWithSize(0xd9cb6b6480, {0x155c6ed0048, 0xdb2c0282d0}, 0x1c00000)
github.com/nlpodyssey/gopickle@v0.3.0/pytorch/storage.go:395 +0x45 fp=0xc00039aad8 sp=0xc00039aa58 pc=0x117a285
github.com/nlpodyssey/gopickle/pytorch.loadTensor({0x155c69b6d20, 0x21a4860}, 0x1c00000, {0xd8a87779cb, 0x3}, {0xc88d0f733e, 0x2}, 0x91e219?)
github.com/nlpodyssey/gopickle@v0.3.0/pytorch/pytorch.go:127 +0x209 fp=0xc00039ab88 sp=0xc00039aad8 pc=0x1176a69
github.com/nlpodyssey/gopickle/pytorch.loadZipFile.func1({0x1744640?, 0xc00036e768?})
github.com/nlpodyssey/gopickle@v0.3.0/pytorch/pytorch.go:99 +0x32b fp=0xc00039ac50 sp=0xc00039ab88 pc=0x11761ab
github.com/nlpodyssey/gopickle/pickle.loadBinPersId(0xd8a87f4600)
github.com/nlpodyssey/gopickle@v0.3.0/pickle/pickle.go:439 +0x3e fp=0xc00039acb0 sp=0xc00039ac50 pc=0x116ebbe
github.com/nlpodyssey/gopickle/pickle.(*Unpickler).Load(0xd8a87f4600)
github.com/nlpodyssey/gopickle@v0.3.0/pickle/pickle.go:102 +0xe6 fp=0xc00039ad08 sp=0xc00039acb0 pc=0x116d1c6
github.com/nlpodyssey/gopickle/pytorch.loadZipFile({0xc0000c11a0, 0x40}, 0x18970a0)
github.com/nlpodyssey/gopickle@v0.3.0/pytorch/pytorch.go:107 +0x5f9 fp=0xc00039ae68 sp=0xc00039ad08 pc=0x1175d39
github.com/nlpodyssey/gopickle/pytorch.LoadWithUnpickler({0xc0000c11a0, 0x40}, 0x18970a0)
github.com/nlpodyssey/gopickle@v0.3.0/pytorch/pytorch.go:40 +0x3d fp=0xc00039ae90 sp=0xc00039ae68 pc=0x11756dd
github.com/nlpodyssey/gopickle/pytorch.Load({0xc0000c11a0?, 0xc00039b0f0?})
github.com/nlpodyssey/gopickle@v0.3.0/pytorch/pytorch.go:31 +0x1f fp=0xc00039aeb8 sp=0xc00039ae90 pc=0x117565f
github.com/ollama/ollama/convert.(*TorchFormat).GetTensors(0x21a4860, {0xc0008584e0, 0x2c}, 0xc0005746e0)
github.com/ollama/ollama/convert/torch.go:46 +0x29f fp=0xc00039b180 sp=0xc00039aeb8 pc=0x118405f
github.com/ollama/ollama/convert.(*LlamaModel).GetTensors(0xc000854360)
github.com/ollama/ollama/convert/llama.go:24 +0x42 fp=0xc00039b2e0 sp=0xc00039b180 pc=0x117cd82
github.com/ollama/ollama/server.parseFromZipFile({0x19bf440?, 0xc0000ed2c0?}, 0xc0001600e0, {0xc000036871, 0x47}, 0xc000238230)
github.com/ollama/ollama/server/model.go:162 +0x229 fp=0xc00039b4e0 sp=0xc00039b2e0 pc=0x133ea29
github.com/ollama/ollama/server.parseFromFile({0x19c6e50, 0xc00011a3c0}, 0xc0001600e0, {0xc000036871, 0x47}, 0xc000238230)
github.com/ollama/ollama/server/model.go:222 +0x177 fp=0xc00039b5c0 sp=0xc00039b4e0 pc=0x133f737
github.com/ollama/ollama/server.CreateModel({0x19c6e50, 0xc00011a3c0}, {{0x18346cd, 0x12}, {0x1828b55, 0x7}, {0xc0002222a0, 0x8}, {0xc0002222a9, 0x3}}, ...)
github.com/ollama/ollama/server/images.go:418 +0x7dd fp=0xc00039be78 sp=0xc00039b5c0 pc=0x13343bd
github.com/ollama/ollama/server.(*Server).CreateModelHandler.func1()
github.com/ollama/ollama/server/routes.go:612 +0x26b fp=0xc00039bfe0 sp=0xc00039be78 pc=0x134942b
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc00039bfe8 sp=0xc00039bfe0 pc=0x892481
created by github.com/ollama/ollama/server.(*Server).CreateModelHandler in goroutine 111
github.com/ollama/ollama/server/routes.go:602 +0x9b9

goroutine 1 gp=0xc0000a2000 m=nil [IO wait, 20 minutes]:
runtime.gopark(0xc0000c3808?, 0x16365e0?, 0x20?, 0x99?, 0xc000499950?)
runtime/proc.go:402 +0xce fp=0xc0002d5670 sp=0xc0002d5650 pc=0x86176e
runtime.netpollblock(0x1c8?, 0x828e06?, 0x0?)
runtime/netpoll.go:573 +0xf7 fp=0xc0002d56a8 sp=0xc0002d5670 pc=0x859017
internal/poll.runtime_pollWait(0x155c6bb8820, 0x72)
runtime/netpoll.go:345 +0x85 fp=0xc0002d56c8 sp=0xc0002d56a8 pc=0x88c025
internal/poll.(*pollDesc).wait(0x840476?, 0x21a6820?, 0x0)
internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0002d56f0 sp=0xc0002d56c8 pc=0x930887
internal/poll.execIO(0xc000499920, 0xc0002d5790)
internal/poll/fd_windows.go:175 +0xe6 fp=0xc0002d5760 sp=0xc0002d56f0 pc=0x931d66
internal/poll.(*FD).acceptOne(0xc000499908, 0x440, {0xc0002b00f0?, 0x0?, 0x1500000000?}, 0xc0000c3808?)
internal/poll/fd_windows.go:944 +0x67 fp=0xc0002d57c0 sp=0xc0002d5760 pc=0x936427
internal/poll.(*FD).Accept(0xc000499908, 0xc0002d5970)
internal/poll/fd_windows.go:978 +0x1bc fp=0xc0002d5878 sp=0xc0002d57c0 pc=0x93675c
net.(*netFD).accept(0xc000499908)
net/fd_windows.go:178 +0x54 fp=0xc0002d5990 sp=0xc0002d5878 pc=0x9c8594
net.(*TCPListener).accept(0xc000543620)
net/tcpsock_posix.go:159 +0x1e fp=0xc0002d59b8 sp=0xc0002d5990 pc=0x9de95e
net.(*TCPListener).Accept(0xc000543620)
net/tcpsock.go:327 +0x30 fp=0xc0002d59e8 sp=0xc0002d59b8 pc=0x9dd750
net/http.(*onceCloseListener).Accept(0xc0001c5200?)
:1 +0x24 fp=0xc0002d5a00 sp=0xc0002d59e8 pc=0xb53924
net/http.(*Server).Serve(0xc00056a2d0, {0x19c4480, 0xc000543620})
net/http/server.go:3260 +0x33e fp=0xc0002d5b30 sp=0xc0002d5a00 pc=0xb312de
github.com/ollama/ollama/server.Serve({0x19c4480, 0xc000543620})
github.com/ollama/ollama/server/routes.go:1182 +0x7c5 fp=0xc0002d5cd0 sp=0xc0002d5b30 pc=0x13500c5
github.com/ollama/ollama/cmd.RunServer(0xc00004d500?, {0x21a4860?, 0x4?, 0x181f0ab?})
github.com/ollama/ollama/cmd/cmd.go:1084 +0x105 fp=0xc0002d5d58 sp=0xc0002d5cd0 pc=0x1373805
github.com/spf13/cobra.(*Command).execute(0xc000570908, {0x21a4860, 0x0, 0x0})
github.com/spf13/cobra@v1.7.0/command.go:940 +0x882 fp=0xc0002d5e78 sp=0xc0002d5d58 pc=0xbcd2e2
github.com/spf13/cobra.(*Command).ExecuteC(0xc000139808)
github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc0002d5f30 sp=0xc0002d5e78 pc=0xbcdb25
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
github.com/ollama/ollama/main.go:11 +0x4d fp=0xc0002d5f50 sp=0xc0002d5f30 pc=0x137c4cd
runtime.main()
runtime/proc.go:271 +0x28b fp=0xc0002d5fe0 sp=0xc0002d5f50 pc=0x86136b
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0002d5fe8 sp=0xc0002d5fe0 pc=0x892481

goroutine 2 gp=0xc0000a2700 m=nil [force gc (idle), 2 minutes]:
runtime.gopark(0x63eee3204744?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc0000a5fa8 sp=0xc0000a5f88 pc=0x86176e
runtime.goparkunlock(...)
runtime/proc.go:408
runtime.forcegchelper()
runtime/proc.go:326 +0xb8 fp=0xc0000a5fe0 sp=0xc0000a5fa8 pc=0x8615f8
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0000a5fe8 sp=0xc0000a5fe0 pc=0x892481
created by runtime.init.6 in goroutine 1
runtime/proc.go:314 +0x1a

goroutine 3 gp=0xc0000a2a80 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc0000a7f80 sp=0xc0000a7f60 pc=0x86176e
runtime.goparkunlock(...)
runtime/proc.go:408
runtime.bgsweep(0xc00003a070)
runtime/mgcsweep.go:318 +0xdf fp=0xc0000a7fc8 sp=0xc0000a7f80 pc=0x84b81f
runtime.gcenable.gowrap1()
runtime/mgc.go:203 +0x25 fp=0xc0000a7fe0 sp=0xc0000a7fc8 pc=0x8400c5
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0000a7fe8 sp=0xc0000a7fe0 pc=0x892481
created by runtime.gcenable in goroutine 1
runtime/mgc.go:203 +0x66

goroutine 4 gp=0xc0000a2c40 m=nil [GC scavenge wait]:
runtime.gopark(0xf89bc?, 0x96420?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc0000b7f78 sp=0xc0000b7f58 pc=0x86176e
runtime.goparkunlock(...)
runtime/proc.go:408
runtime.(*scavengerState).park(0x2118260)
runtime/mgcscavenge.go:425 +0x49 fp=0xc0000b7fa8 sp=0xc0000b7f78 pc=0x8491a9
runtime.bgscavenge(0xc00003a070)
runtime/mgcscavenge.go:658 +0x59 fp=0xc0000b7fc8 sp=0xc0000b7fa8 pc=0x849759
runtime.gcenable.gowrap2()
runtime/mgc.go:204 +0x25 fp=0xc0000b7fe0 sp=0xc0000b7fc8 pc=0x840065
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b7fe8 sp=0xc0000b7fe0 pc=0x892481
created by runtime.gcenable in goroutine 1
runtime/mgc.go:204 +0xa5

goroutine 5 gp=0xc0000a3180 m=nil [finalizer wait, 54 minutes]:
runtime.gopark(0xc0000a9e48?, 0x833465?, 0xa8?, 0x1?, 0xc0000a2000?)
runtime/proc.go:402 +0xce fp=0xc0000a9e20 sp=0xc0000a9e00 pc=0x86176e
runtime.runfinq()
runtime/mfinal.go:194 +0x107 fp=0xc0000a9fe0 sp=0xc0000a9e20 pc=0x83f147
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0000a9fe8 sp=0xc0000a9fe0 pc=0x892481
created by runtime.createfing in goroutine 1
runtime/mfinal.go:164 +0x3d

goroutine 6 gp=0xc00021cc40 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc0000b9f50 sp=0xc0000b9f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc0000b9fe0 sp=0xc0000b9f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b9fe8 sp=0xc0000b9fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 7 gp=0xc00021ce00 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc0000b3f50 sp=0xc0000b3f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc0000b3fe0 sp=0xc0000b3f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b3fe8 sp=0xc0000b3fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 18 gp=0xc000500000 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000507f50 sp=0xc000507f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000507fe0 sp=0xc000507f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000507fe8 sp=0xc000507fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 19 gp=0xc0005001c0 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000509f50 sp=0xc000509f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000509fe0 sp=0xc000509f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000509fe8 sp=0xc000509fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 34 gp=0xc0001041c0 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000503f50 sp=0xc000503f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000503fe0 sp=0xc000503f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000503fe8 sp=0xc000503fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 35 gp=0xc000104380 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000505f50 sp=0xc000505f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000505fe0 sp=0xc000505f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000505fe8 sp=0xc000505fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 8 gp=0xc00021cfc0 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc0000b5f50 sp=0xc0000b5f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc0000b5fe0 sp=0xc0000b5f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b5fe8 sp=0xc0000b5fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 20 gp=0xc000500380 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000513f50 sp=0xc000513f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000513fe0 sp=0xc000513f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000513fe8 sp=0xc000513fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 21 gp=0xc000500540 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000515f50 sp=0xc000515f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000515fe0 sp=0xc000515f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000515fe8 sp=0xc000515fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 9 gp=0xc00021d180 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc00050ff50 sp=0xc00050ff30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc00050ffe0 sp=0xc00050ff50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc00050ffe8 sp=0xc00050ffe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 10 gp=0xc00021d340 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000511f50 sp=0xc000511f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000511fe0 sp=0xc000511f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000511fe8 sp=0xc000511fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 11 gp=0xc00021d500 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000487f50 sp=0xc000487f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000487fe0 sp=0xc000487f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000487fe8 sp=0xc000487fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 36 gp=0xc000104540 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000483f50 sp=0xc000483f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000483fe0 sp=0xc000483f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000483fe8 sp=0xc000483fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 50 gp=0xc000580000 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000587f50 sp=0xc000587f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000587fe0 sp=0xc000587f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000587fe8 sp=0xc000587fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 37 gp=0xc000104700 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000485f50 sp=0xc000485f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000485fe0 sp=0xc000485f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000485fe8 sp=0xc000485fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 51 gp=0xc0005801c0 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000589f50 sp=0xc000589f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000589fe0 sp=0xc000589f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000589fe8 sp=0xc000589fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 12 gp=0xc00021d6c0 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000489f50 sp=0xc000489f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000489fe0 sp=0xc000489f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000489fe8 sp=0xc000489fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 13 gp=0xc00021d880 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000583f50 sp=0xc000583f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000583fe0 sp=0xc000583f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000583fe8 sp=0xc000583fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 22 gp=0xc000500700 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc00051bf50 sp=0xc00051bf30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc00051bfe0 sp=0xc00051bf50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc00051bfe8 sp=0xc00051bfe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 52 gp=0xc000580380 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000517f50 sp=0xc000517f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000517fe0 sp=0xc000517f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000517fe8 sp=0xc000517fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 53 gp=0xc000580540 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000519f50 sp=0xc000519f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000519fe0 sp=0xc000519f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000519fe8 sp=0xc000519fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 23 gp=0xc0005008c0 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc00051df50 sp=0xc00051df30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc00051dfe0 sp=0xc00051df50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc00051dfe8 sp=0xc00051dfe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 24 gp=0xc000500a80 m=nil [GC worker (idle), 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000523f50 sp=0xc000523f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000523fe0 sp=0xc000523f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000523fe8 sp=0xc000523fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 14 gp=0xc00021da40 m=nil [GC worker (idle), 2 minutes]:
runtime.gopark(0x21a6820?, 0x1?, 0xbc?, 0x1c?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000585f50 sp=0xc000585f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000585fe0 sp=0xc000585f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000585fe8 sp=0xc000585fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 25 gp=0xc000500c40 m=nil [GC worker (idle)]:
runtime.gopark(0x63eee4216ee8?, 0x1?, 0x9c?, 0x7a?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000525f50 sp=0xc000525f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000525fe0 sp=0xc000525f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000525fe8 sp=0xc000525fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 15 gp=0xc00021dc00 m=nil [GC worker (idle)]:
runtime.gopark(0x21a6820?, 0x1?, 0x40?, 0x43?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc00051ff50 sp=0xc00051ff30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc00051ffe0 sp=0xc00051ff50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc00051ffe8 sp=0xc00051ffe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 26 gp=0xc000500e00 m=nil [GC worker (idle), 7 minutes]:
runtime.gopark(0x638de805c1f0?, 0x1?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc00052bf50 sp=0xc00052bf30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc00052bfe0 sp=0xc00052bf50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc00052bfe8 sp=0xc00052bfe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 16 gp=0xc00021ddc0 m=nil [GC worker (idle)]:
runtime.gopark(0x21a6820?, 0x1?, 0x48?, 0xae?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000521f50 sp=0xc000521f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000521fe0 sp=0xc000521f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000521fe8 sp=0xc000521fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 27 gp=0xc000500fc0 m=nil [GC worker (idle)]:
runtime.gopark(0x21a6820?, 0x1?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc00052df50 sp=0xc00052df30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc00052dfe0 sp=0xc00052df50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc00052dfe8 sp=0xc00052dfe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 66 gp=0xc00048a000 m=nil [GC worker (idle)]:
runtime.gopark(0x21a6820?, 0x1?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000527f50 sp=0xc000527f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000527fe0 sp=0xc000527f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000527fe8 sp=0xc000527fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 28 gp=0xc000501180 m=nil [GC worker (idle)]:
runtime.gopark(0x63eee4216ee8?, 0x1?, 0xd0?, 0x43?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000535f50 sp=0xc000535f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000535fe0 sp=0xc000535f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000535fe8 sp=0xc000535fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 67 gp=0xc00048a1c0 m=nil [GC worker (idle)]:
runtime.gopark(0x21a6820?, 0x1?, 0x40?, 0x5?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000529f50 sp=0xc000529f30 pc=0x86176e
runtime.gcBgMarkWorker()
runtime/mgc.go:1310 +0xe5 fp=0xc000529fe0 sp=0xc000529f50 pc=0x842205
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000529fe8 sp=0xc000529fe0 pc=0x892481
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1234 +0x1c

goroutine 38 gp=0xc00048a380 m=8 mp=0xc000600008 [syscall, 54 minutes]:
runtime.notetsleepg(0x21a5460, 0xffffffffffffffff)
runtime/lock_sema.go:296 +0x31 fp=0xc000533fa0 sp=0xc000533f68 pc=0x831a31
os/signal.signal_recv()
runtime/sigqueue.go:152 +0x29 fp=0xc000533fc0 sp=0xc000533fa0 pc=0x88e189
os/signal.loop()
os/signal/signal_unix.go:23 +0x13 fp=0xc000533fe0 sp=0xc000533fc0 pc=0xb55d73
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000533fe8 sp=0xc000533fe0 pc=0x892481
created by os/signal.Notify.func1.1 in goroutine 1
os/signal/signal.go:151 +0x1f

goroutine 39 gp=0xc00048a540 m=nil [chan receive, 54 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:402 +0xce fp=0xc000537f00 sp=0xc000537ee0 pc=0x86176e
runtime.chanrecv(0xc0001725a0, 0x0, 0x1)
runtime/chan.go:583 +0x3cd fp=0xc000537f78 sp=0xc000537f00 pc=0x82b9ad
runtime.chanrecv1(0x0?, 0x0?)
runtime/chan.go:442 +0x12 fp=0xc000537fa0 sp=0xc000537f78 pc=0x82b5b2
github.com/ollama/ollama/server.Serve.func2()
github.com/ollama/ollama/server/routes.go:1163 +0x3d fp=0xc000537fe0 sp=0xc000537fa0 pc=0x13501dd
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000537fe8 sp=0xc000537fe0 pc=0x892481
created by github.com/ollama/ollama/server.Serve in goroutine 1
github.com/ollama/ollama/server/routes.go:1162 +0x72c

goroutine 40 gp=0xc00048a700 m=nil [select, 34 minutes]:
runtime.gopark(0xc0003abf50?, 0x3?, 0x40?, 0xac?, 0xc0003abd12?)
runtime/proc.go:402 +0xce fp=0xc0003abb98 sp=0xc0003abb78 pc=0x86176e
runtime.selectgo(0xc0003abf50, 0xc0003abd0c, 0x21a4860?, 0x0, 0x185739b?, 0x1)
runtime/select.go:327 +0x725 fp=0xc0003abcb8 sp=0xc0003abb98 pc=0x871bc5
github.com/ollama/ollama/server.(*Scheduler).processPending(0xc000172180, {0x19c6e50, 0xc0000be640})
github.com/ollama/ollama/server/sched.go:114 +0xcf fp=0xc0003abfb8 sp=0xc0003abcb8 pc=0x1352f4f
github.com/ollama/ollama/server.(*Scheduler).Run.func1()
github.com/ollama/ollama/server/sched.go:104 +0x1f fp=0xc0003abfe0 sp=0xc0003abfb8 pc=0x1352e5f
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0003abfe8 sp=0xc0003abfe0 pc=0x892481
created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1
github.com/ollama/ollama/server/sched.go:103 +0xb4

goroutine 41 gp=0xc00048a8c0 m=nil [select, 34 minutes]:
runtime.gopark(0xc000047f50?, 0x3?, 0x8?, 0x7c?, 0xc000047d52?)
runtime/proc.go:402 +0xce fp=0xc000531be0 sp=0xc000531bc0 pc=0x86176e
runtime.selectgo(0xc000531f50, 0xc000047d4c, 0x21a4860?, 0x0, 0x183d4a1?, 0x1)
runtime/select.go:327 +0x725 fp=0xc000531d00 sp=0xc000531be0 pc=0x871bc5
github.com/ollama/ollama/server.(*Scheduler).processCompleted(0xc000172180, {0x19c6e50, 0xc0000be640})
github.com/ollama/ollama/server/sched.go:303 +0xec fp=0xc000531fb8 sp=0xc000531d00 pc=0x135410c
github.com/ollama/ollama/server.(*Scheduler).Run.func2()
github.com/ollama/ollama/server/sched.go:108 +0x1f fp=0xc000531fe0 sp=0xc000531fb8 pc=0x1352e1f
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000531fe8 sp=0xc000531fe0 pc=0x892481
created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1
github.com/ollama/ollama/server/sched.go:107 +0x110

goroutine 111 gp=0xc000581340 m=nil [chan receive, 8 minutes]:
runtime.gopark(0x4?, 0xc0001c5200?, 0x1?, 0x0?, 0xc0000472f0?)
runtime/proc.go:402 +0xce fp=0xc000047290 sp=0xc000047270 pc=0x86176e
runtime.chanrecv(0xc0000c0480, 0xc000047388, 0x1)
runtime/chan.go:583 +0x3cd fp=0xc000047308 sp=0xc000047290 pc=0x82b9ad
runtime.chanrecv2(0xc000116040?, 0xc000380080?)
runtime/chan.go:447 +0x12 fp=0xc000047330 sp=0xc000047308 pc=0x82b5d2
github.com/ollama/ollama/server.streamResponse.func1({0x155c6da18e0, 0xc00010c000})
github.com/ollama/ollama/server/routes.go:1224 +0x36 fp=0xc0000473a8 sp=0xc000047330 pc=0x13507f6
github.com/gin-gonic/gin.(*Context).Stream(0xc000047428?, 0xc000047418)
github.com/gin-gonic/gin@v1.10.0/context.go:1124 +0x79 fp=0xc0000473f0 sp=0xc0000473a8 pc=0x130fa59
github.com/ollama/ollama/server.streamResponse(0xc00010c000, 0xc0000c0480)
github.com/ollama/ollama/server/routes.go:1223 +0x65 fp=0xc000047438 sp=0xc0000473f0 pc=0x1350785
github.com/ollama/ollama/server.(*Server).CreateModelHandler(0x18021a685d?, 0xc00010c000)
github.com/ollama/ollama/server/routes.go:624 +0xa25 fp=0xc000047660 sp=0xc000047438 pc=0x1348f45
github.com/ollama/ollama/server.(*Server).CreateModelHandler-fm(0x9?)
:1 +0x26 fp=0xc000047680 sp=0xc000047660 pc=0x1363386
github.com/gin-gonic/gin.(*Context).Next(0xc00010c000)
github.com/gin-gonic/gin@v1.10.0/context.go:185 +0x2b fp=0xc0000476a0 sp=0xc000047680 pc=0x1309c4b
github.com/ollama/ollama/server.(*Server).GenerateRoutes.allowedHostsMiddleware.func3(0xc00010c000)
github.com/ollama/ollama/server/routes.go:1022 +0x115 fp=0xc0000476f8 sp=0xc0000476a0 pc=0x134f855
github.com/gin-gonic/gin.(*Context).Next(...)
github.com/gin-gonic/gin@v1.10.0/context.go:185
github.com/gin-gonic/gin.CustomRecoveryWithWriter.func1(0xc00010c000)
github.com/gin-gonic/gin@v1.10.0/recovery.go:102 +0x7a fp=0xc000047748 sp=0xc0000476f8 pc=0x1317cba
github.com/gin-gonic/gin.(*Context).Next(...)
github.com/gin-gonic/gin@v1.10.0/context.go:185
github.com/gin-gonic/gin.LoggerWithConfig.func1(0xc00010c000)
github.com/gin-gonic/gin@v1.10.0/logger.go:249 +0xe5 fp=0xc000047900 sp=0xc000047748 pc=0x1316de5
github.com/gin-gonic/gin.(*Context).Next(...)
github.com/gin-gonic/gin@v1.10.0/context.go:185
github.com/gin-gonic/gin.(*Engine).handleHTTPRequest(0xc0001321a0, 0xc00010c000)
github.com/gin-gonic/gin@v1.10.0/gin.go:633 +0x892 fp=0xc000047ad8 sp=0xc000047900 pc=0x1316212
github.com/gin-gonic/gin.(*Engine).ServeHTTP(0xc0001321a0, {0x19c4690, 0xc0005560e0}, 0xc0004e6240)
github.com/gin-gonic/gin@v1.10.0/gin.go:589 +0x1b2 fp=0xc000047b10 sp=0xc000047ad8 pc=0x13157b2
net/http.(*ServeMux).ServeHTTP(0x833465?, {0x19c4690, 0xc0005560e0}, 0xc0004e6240)
net/http/server.go:2688 +0x1ad fp=0xc000047b60 sp=0xc000047b10 pc=0xb2f6ad
net/http.serverHandler.ServeHTTP({0x19c21d0?}, {0x19c4690?, 0xc0005560e0?}, 0x6?)
net/http/server.go:3142 +0x8e fp=0xc000047b90 sp=0xc000047b60 pc=0xb30eae
net/http.(*conn).serve(0xc0001c5200, {0x19c6e18, 0xc0000ece10})
net/http/server.go:2044 +0x5e8 fp=0xc000047fb8 sp=0xc000047b90 pc=0xb2c1a8
net/http.(*Server).Serve.gowrap3()
net/http/server.go:3290 +0x28 fp=0xc000047fe0 sp=0xc000047fb8 pc=0xb316c8
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc000047fe8 sp=0xc000047fe0 pc=0x892481
created by net/http.(*Server).Serve in goroutine 1
net/http/server.go:3290 +0x4b4

goroutine 194 gp=0xc000105180 m=nil [IO wait, 14 minutes]:
runtime.gopark(0x0?, 0xc000553920?, 0xd0?, 0x39?, 0xc000553950?)
runtime/proc.go:402 +0xce fp=0xc0006b5d28 sp=0xc0006b5d08 pc=0x86176e
runtime.netpollblock(0x538?, 0x828e06?, 0x0?)
runtime/netpoll.go:573 +0xf7 fp=0xc0006b5d60 sp=0xc0006b5d28 pc=0x859017
internal/poll.runtime_pollWait(0x155c6bb8728, 0x72)
runtime/netpoll.go:345 +0x85 fp=0xc0006b5d80 sp=0xc0006b5d60 pc=0x88c025
internal/poll.(*pollDesc).wait(0x10?, 0x10?, 0x0)
internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0006b5da8 sp=0xc0006b5d80 pc=0x930887
internal/poll.execIO(0xc000553920, 0x1896ac0)
internal/poll/fd_windows.go:175 +0xe6 fp=0xc0006b5e18 sp=0xc0006b5da8 pc=0x931d66
internal/poll.(*FD).Read(0xc000553908, {0xc000690041, 0x1, 0x1})
internal/poll/fd_windows.go:436 +0x2b1 fp=0xc0006b5ec0 sp=0xc0006b5e18 pc=0x932a11
net.(*netFD).Read(0xc000553908, {0xc000690041?, 0xc0006b5f48?, 0x88ded0?})
net/fd_posix.go:55 +0x25 fp=0xc0006b5f08 sp=0xc0006b5ec0 pc=0x9c66a5
net.(*conn).Read(0xc000160000, {0xc000690041?, 0xc0000bcd80?, 0x21a4860?})
net/net.go:185 +0x45 fp=0xc0006b5f50 sp=0xc0006b5f08 pc=0x9d6345
net.(*TCPConn).Read(0x160ee10?, {0xc000690041?, 0x0?, 0xc00063e3c0?})
:1 +0x25 fp=0xc0006b5f80 sp=0xc0006b5f50 pc=0x9e6665
net/http.(*connReader).backgroundRead(0xc000690030)
net/http/server.go:681 +0x37 fp=0xc0006b5fc8 sp=0xc0006b5f80 pc=0xb26117
net/http.(*connReader).startBackgroundRead.gowrap2()
net/http/server.go:677 +0x25 fp=0xc0006b5fe0 sp=0xc0006b5fc8 pc=0xb26045
runtime.goexit({})
runtime/asm_amd64.s:1695 +0x1 fp=0xc0006b5fe8 sp=0xc0006b5fe0 pc=0x892481
created by net/http.(*connReader).startBackgroundRead in goroutine 111
net/http/server.go:677 +0xba

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.3.0
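Editor's note (an inference from the crash dump, not a confirmed diagnosis): the fatal error is `runtime: VirtualAlloc ... failed with errno=1455` (Windows `ERROR_COMMITMENT_LIMIT`, i.e. the paging file is too small) raised while gopickle's `BFloat16Storage.SetFromFileWithSize` was materializing checkpoint tensors in RAM during `ollama create`. A back-of-the-envelope estimate suggests why converting a 70B bf16 PyTorch checkpoint can exhaust commit memory:

```python
# Rough memory estimate for materializing a bf16 checkpoint in RAM.
# Illustrative arithmetic only: 70e9 parameters is taken from the model
# name in the issue title; 2 bytes/parameter is the bfloat16 width.
params = 70_000_000_000
bytes_per_param = 2  # bfloat16
total_gib = params * bytes_per_param / 2**30
print(f"~{total_gib:.0f} GiB for raw weights alone")
```

That is on the order of 130 GiB before any conversion overhead, which would exceed the physical RAM plus default pagefile on most desktops, consistent with the commit-limit failure above.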

runtime/proc.go:402 +0xce fp=0xc000503f50 sp=0xc000503f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000503fe0 sp=0xc000503f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000503fe8 sp=0xc000503fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 35 gp=0xc000104380 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000505f50 sp=0xc000505f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000505fe0 sp=0xc000505f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000505fe8 sp=0xc000505fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 8 gp=0xc00021cfc0 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc0000b5f50 sp=0xc0000b5f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc0000b5fe0 sp=0xc0000b5f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc0000b5fe8 sp=0xc0000b5fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 20 gp=0xc000500380 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000513f50 sp=0xc000513f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000513fe0 sp=0xc000513f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000513fe8 sp=0xc000513fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 21 gp=0xc000500540 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:402 +0xce fp=0xc000515f50 sp=0xc000515f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000515fe0 sp=0xc000515f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000515fe8 sp=0xc000515fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 9 gp=0xc00021d180 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc00050ff50 sp=0xc00050ff30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc00050ffe0 sp=0xc00050ff50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc00050ffe8 sp=0xc00050ffe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 10 gp=0xc00021d340 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000511f50 sp=0xc000511f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000511fe0 sp=0xc000511f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000511fe8 sp=0xc000511fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 11 gp=0xc00021d500 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000487f50 sp=0xc000487f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000487fe0 sp=0xc000487f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000487fe8 sp=0xc000487fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 36 gp=0xc000104540 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:402 +0xce fp=0xc000483f50 sp=0xc000483f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000483fe0 sp=0xc000483f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000483fe8 sp=0xc000483fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 50 gp=0xc000580000 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000587f50 sp=0xc000587f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000587fe0 sp=0xc000587f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000587fe8 sp=0xc000587fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 37 gp=0xc000104700 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000485f50 sp=0xc000485f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000485fe0 sp=0xc000485f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000485fe8 sp=0xc000485fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 51 gp=0xc0005801c0 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000589f50 sp=0xc000589f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000589fe0 sp=0xc000589f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000589fe8 sp=0xc000589fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 12 gp=0xc00021d6c0 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:402 +0xce fp=0xc000489f50 sp=0xc000489f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000489fe0 sp=0xc000489f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000489fe8 sp=0xc000489fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 13 gp=0xc00021d880 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000583f50 sp=0xc000583f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000583fe0 sp=0xc000583f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000583fe8 sp=0xc000583fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 22 gp=0xc000500700 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc00051bf50 sp=0xc00051bf30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc00051bfe0 sp=0xc00051bf50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc00051bfe8 sp=0xc00051bfe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 52 gp=0xc000580380 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000517f50 sp=0xc000517f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000517fe0 sp=0xc000517f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000517fe8 sp=0xc000517fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 53 gp=0xc000580540 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:402 +0xce fp=0xc000519f50 sp=0xc000519f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000519fe0 sp=0xc000519f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000519fe8 sp=0xc000519fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 23 gp=0xc0005008c0 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc00051df50 sp=0xc00051df30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc00051dfe0 sp=0xc00051df50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc00051dfe8 sp=0xc00051dfe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 24 gp=0xc000500a80 m=nil [GC worker (idle), 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000523f50 sp=0xc000523f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000523fe0 sp=0xc000523f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000523fe8 sp=0xc000523fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 14 gp=0xc00021da40 m=nil [GC worker (idle), 2 minutes]: runtime.gopark(0x21a6820?, 0x1?, 0xbc?, 0x1c?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000585f50 sp=0xc000585f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000585fe0 sp=0xc000585f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000585fe8 sp=0xc000585fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 25 gp=0xc000500c40 m=nil [GC worker (idle)]: runtime.gopark(0x63eee4216ee8?, 0x1?, 0x9c?, 0x7a?, 0x0?) 
runtime/proc.go:402 +0xce fp=0xc000525f50 sp=0xc000525f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000525fe0 sp=0xc000525f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000525fe8 sp=0xc000525fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 15 gp=0xc00021dc00 m=nil [GC worker (idle)]: runtime.gopark(0x21a6820?, 0x1?, 0x40?, 0x43?, 0x0?) runtime/proc.go:402 +0xce fp=0xc00051ff50 sp=0xc00051ff30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc00051ffe0 sp=0xc00051ff50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc00051ffe8 sp=0xc00051ffe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 26 gp=0xc000500e00 m=nil [GC worker (idle), 7 minutes]: runtime.gopark(0x638de805c1f0?, 0x1?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc00052bf50 sp=0xc00052bf30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc00052bfe0 sp=0xc00052bf50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc00052bfe8 sp=0xc00052bfe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 16 gp=0xc00021ddc0 m=nil [GC worker (idle)]: runtime.gopark(0x21a6820?, 0x1?, 0x48?, 0xae?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000521f50 sp=0xc000521f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000521fe0 sp=0xc000521f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000521fe8 sp=0xc000521fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 27 gp=0xc000500fc0 m=nil [GC worker (idle)]: runtime.gopark(0x21a6820?, 0x1?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:402 +0xce fp=0xc00052df50 sp=0xc00052df30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc00052dfe0 sp=0xc00052df50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc00052dfe8 sp=0xc00052dfe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 66 gp=0xc00048a000 m=nil [GC worker (idle)]: runtime.gopark(0x21a6820?, 0x1?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000527f50 sp=0xc000527f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000527fe0 sp=0xc000527f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000527fe8 sp=0xc000527fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 28 gp=0xc000501180 m=nil [GC worker (idle)]: runtime.gopark(0x63eee4216ee8?, 0x1?, 0xd0?, 0x43?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000535f50 sp=0xc000535f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000535fe0 sp=0xc000535f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000535fe8 sp=0xc000535fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 67 gp=0xc00048a1c0 m=nil [GC worker (idle)]: runtime.gopark(0x21a6820?, 0x1?, 0x40?, 0x5?, 0x0?) 
runtime/proc.go:402 +0xce fp=0xc000529f50 sp=0xc000529f30 pc=0x86176e runtime.gcBgMarkWorker() runtime/mgc.go:1310 +0xe5 fp=0xc000529fe0 sp=0xc000529f50 pc=0x842205 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000529fe8 sp=0xc000529fe0 pc=0x892481 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1234 +0x1c goroutine 38 gp=0xc00048a380 m=8 mp=0xc000600008 [syscall, 54 minutes]: runtime.notetsleepg(0x21a5460, 0xffffffffffffffff) runtime/lock_sema.go:296 +0x31 fp=0xc000533fa0 sp=0xc000533f68 pc=0x831a31 os/signal.signal_recv() runtime/sigqueue.go:152 +0x29 fp=0xc000533fc0 sp=0xc000533fa0 pc=0x88e189 os/signal.loop() os/signal/signal_unix.go:23 +0x13 fp=0xc000533fe0 sp=0xc000533fc0 pc=0xb55d73 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000533fe8 sp=0xc000533fe0 pc=0x892481 created by os/signal.Notify.func1.1 in goroutine 1 os/signal/signal.go:151 +0x1f goroutine 39 gp=0xc00048a540 m=nil [chan receive, 54 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:402 +0xce fp=0xc000537f00 sp=0xc000537ee0 pc=0x86176e runtime.chanrecv(0xc0001725a0, 0x0, 0x1) runtime/chan.go:583 +0x3cd fp=0xc000537f78 sp=0xc000537f00 pc=0x82b9ad runtime.chanrecv1(0x0?, 0x0?) runtime/chan.go:442 +0x12 fp=0xc000537fa0 sp=0xc000537f78 pc=0x82b5b2 github.com/ollama/ollama/server.Serve.func2() github.com/ollama/ollama/server/routes.go:1163 +0x3d fp=0xc000537fe0 sp=0xc000537fa0 pc=0x13501dd runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000537fe8 sp=0xc000537fe0 pc=0x892481 created by github.com/ollama/ollama/server.Serve in goroutine 1 github.com/ollama/ollama/server/routes.go:1162 +0x72c goroutine 40 gp=0xc00048a700 m=nil [select, 34 minutes]: runtime.gopark(0xc0003abf50?, 0x3?, 0x40?, 0xac?, 0xc0003abd12?) 
runtime/proc.go:402 +0xce fp=0xc0003abb98 sp=0xc0003abb78 pc=0x86176e runtime.selectgo(0xc0003abf50, 0xc0003abd0c, 0x21a4860?, 0x0, 0x185739b?, 0x1) runtime/select.go:327 +0x725 fp=0xc0003abcb8 sp=0xc0003abb98 pc=0x871bc5 github.com/ollama/ollama/server.(*Scheduler).processPending(0xc000172180, {0x19c6e50, 0xc0000be640}) github.com/ollama/ollama/server/sched.go:114 +0xcf fp=0xc0003abfb8 sp=0xc0003abcb8 pc=0x1352f4f github.com/ollama/ollama/server.(*Scheduler).Run.func1() github.com/ollama/ollama/server/sched.go:104 +0x1f fp=0xc0003abfe0 sp=0xc0003abfb8 pc=0x1352e5f runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc0003abfe8 sp=0xc0003abfe0 pc=0x892481 created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1 github.com/ollama/ollama/server/sched.go:103 +0xb4 goroutine 41 gp=0xc00048a8c0 m=nil [select, 34 minutes]: runtime.gopark(0xc000047f50?, 0x3?, 0x8?, 0x7c?, 0xc000047d52?) runtime/proc.go:402 +0xce fp=0xc000531be0 sp=0xc000531bc0 pc=0x86176e runtime.selectgo(0xc000531f50, 0xc000047d4c, 0x21a4860?, 0x0, 0x183d4a1?, 0x1) runtime/select.go:327 +0x725 fp=0xc000531d00 sp=0xc000531be0 pc=0x871bc5 github.com/ollama/ollama/server.(*Scheduler).processCompleted(0xc000172180, {0x19c6e50, 0xc0000be640}) github.com/ollama/ollama/server/sched.go:303 +0xec fp=0xc000531fb8 sp=0xc000531d00 pc=0x135410c github.com/ollama/ollama/server.(*Scheduler).Run.func2() github.com/ollama/ollama/server/sched.go:108 +0x1f fp=0xc000531fe0 sp=0xc000531fb8 pc=0x1352e1f runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000531fe8 sp=0xc000531fe0 pc=0x892481 created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1 github.com/ollama/ollama/server/sched.go:107 +0x110 goroutine 111 gp=0xc000581340 m=nil [chan receive, 8 minutes]: runtime.gopark(0x4?, 0xc0001c5200?, 0x1?, 0x0?, 0xc0000472f0?) 
runtime/proc.go:402 +0xce fp=0xc000047290 sp=0xc000047270 pc=0x86176e runtime.chanrecv(0xc0000c0480, 0xc000047388, 0x1) runtime/chan.go:583 +0x3cd fp=0xc000047308 sp=0xc000047290 pc=0x82b9ad runtime.chanrecv2(0xc000116040?, 0xc000380080?) runtime/chan.go:447 +0x12 fp=0xc000047330 sp=0xc000047308 pc=0x82b5d2 github.com/ollama/ollama/server.streamResponse.func1({0x155c6da18e0, 0xc00010c000}) github.com/ollama/ollama/server/routes.go:1224 +0x36 fp=0xc0000473a8 sp=0xc000047330 pc=0x13507f6 github.com/gin-gonic/gin.(*Context).Stream(0xc000047428?, 0xc000047418) github.com/gin-gonic/gin@v1.10.0/context.go:1124 +0x79 fp=0xc0000473f0 sp=0xc0000473a8 pc=0x130fa59 github.com/ollama/ollama/server.streamResponse(0xc00010c000, 0xc0000c0480) github.com/ollama/ollama/server/routes.go:1223 +0x65 fp=0xc000047438 sp=0xc0000473f0 pc=0x1350785 github.com/ollama/ollama/server.(*Server).CreateModelHandler(0x18021a685d?, 0xc00010c000) github.com/ollama/ollama/server/routes.go:624 +0xa25 fp=0xc000047660 sp=0xc000047438 pc=0x1348f45 github.com/ollama/ollama/server.(*Server).CreateModelHandler-fm(0x9?) <autogenerated>:1 +0x26 fp=0xc000047680 sp=0xc000047660 pc=0x1363386 github.com/gin-gonic/gin.(*Context).Next(0xc00010c000) github.com/gin-gonic/gin@v1.10.0/context.go:185 +0x2b fp=0xc0000476a0 sp=0xc000047680 pc=0x1309c4b github.com/ollama/ollama/server.(*Server).GenerateRoutes.allowedHostsMiddleware.func3(0xc00010c000) github.com/ollama/ollama/server/routes.go:1022 +0x115 fp=0xc0000476f8 sp=0xc0000476a0 pc=0x134f855 github.com/gin-gonic/gin.(*Context).Next(...) github.com/gin-gonic/gin@v1.10.0/context.go:185 github.com/gin-gonic/gin.CustomRecoveryWithWriter.func1(0xc00010c000) github.com/gin-gonic/gin@v1.10.0/recovery.go:102 +0x7a fp=0xc000047748 sp=0xc0000476f8 pc=0x1317cba github.com/gin-gonic/gin.(*Context).Next(...) 
github.com/gin-gonic/gin@v1.10.0/context.go:185 github.com/gin-gonic/gin.LoggerWithConfig.func1(0xc00010c000) github.com/gin-gonic/gin@v1.10.0/logger.go:249 +0xe5 fp=0xc000047900 sp=0xc000047748 pc=0x1316de5 github.com/gin-gonic/gin.(*Context).Next(...) github.com/gin-gonic/gin@v1.10.0/context.go:185 github.com/gin-gonic/gin.(*Engine).handleHTTPRequest(0xc0001321a0, 0xc00010c000) github.com/gin-gonic/gin@v1.10.0/gin.go:633 +0x892 fp=0xc000047ad8 sp=0xc000047900 pc=0x1316212 github.com/gin-gonic/gin.(*Engine).ServeHTTP(0xc0001321a0, {0x19c4690, 0xc0005560e0}, 0xc0004e6240) github.com/gin-gonic/gin@v1.10.0/gin.go:589 +0x1b2 fp=0xc000047b10 sp=0xc000047ad8 pc=0x13157b2 net/http.(*ServeMux).ServeHTTP(0x833465?, {0x19c4690, 0xc0005560e0}, 0xc0004e6240) net/http/server.go:2688 +0x1ad fp=0xc000047b60 sp=0xc000047b10 pc=0xb2f6ad net/http.serverHandler.ServeHTTP({0x19c21d0?}, {0x19c4690?, 0xc0005560e0?}, 0x6?) net/http/server.go:3142 +0x8e fp=0xc000047b90 sp=0xc000047b60 pc=0xb30eae net/http.(*conn).serve(0xc0001c5200, {0x19c6e18, 0xc0000ece10}) net/http/server.go:2044 +0x5e8 fp=0xc000047fb8 sp=0xc000047b90 pc=0xb2c1a8 net/http.(*Server).Serve.gowrap3() net/http/server.go:3290 +0x28 fp=0xc000047fe0 sp=0xc000047fb8 pc=0xb316c8 runtime.goexit({}) runtime/asm_amd64.s:1695 +0x1 fp=0xc000047fe8 sp=0xc000047fe0 pc=0x892481 created by net/http.(*Server).Serve in goroutine 1 net/http/server.go:3290 +0x4b4 goroutine 194 gp=0xc000105180 m=nil [IO wait, 14 minutes]: runtime.gopark(0x0?, 0xc000553920?, 0xd0?, 0x39?, 0xc000553950?) runtime/proc.go:402 +0xce fp=0xc0006b5d28 sp=0xc0006b5d08 pc=0x86176e runtime.netpollblock(0x538?, 0x828e06?, 0x0?) 
runtime/netpoll.go:573 +0xf7 fp=0xc0006b5d60 sp=0xc0006b5d28 pc=0x859017
internal/poll.runtime_pollWait(0x155c6bb8728, 0x72)
	runtime/netpoll.go:345 +0x85 fp=0xc0006b5d80 sp=0xc0006b5d60 pc=0x88c025
internal/poll.(*pollDesc).wait(0x10?, 0x10?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0006b5da8 sp=0xc0006b5d80 pc=0x930887
internal/poll.execIO(0xc000553920, 0x1896ac0)
	internal/poll/fd_windows.go:175 +0xe6 fp=0xc0006b5e18 sp=0xc0006b5da8 pc=0x931d66
internal/poll.(*FD).Read(0xc000553908, {0xc000690041, 0x1, 0x1})
	internal/poll/fd_windows.go:436 +0x2b1 fp=0xc0006b5ec0 sp=0xc0006b5e18 pc=0x932a11
net.(*netFD).Read(0xc000553908, {0xc000690041?, 0xc0006b5f48?, 0x88ded0?})
	net/fd_posix.go:55 +0x25 fp=0xc0006b5f08 sp=0xc0006b5ec0 pc=0x9c66a5
net.(*conn).Read(0xc000160000, {0xc000690041?, 0xc0000bcd80?, 0x21a4860?})
	net/net.go:185 +0x45 fp=0xc0006b5f50 sp=0xc0006b5f08 pc=0x9d6345
net.(*TCPConn).Read(0x160ee10?, {0xc000690041?, 0x0?, 0xc00063e3c0?})
	<autogenerated>:1 +0x25 fp=0xc0006b5f80 sp=0xc0006b5f50 pc=0x9e6665
net/http.(*connReader).backgroundRead(0xc000690030)
	net/http/server.go:681 +0x37 fp=0xc0006b5fc8 sp=0xc0006b5f80 pc=0xb26117
net/http.(*connReader).startBackgroundRead.gowrap2()
	net/http/server.go:677 +0x25 fp=0xc0006b5fe0 sp=0xc0006b5fc8 pc=0xb26045
runtime.goexit({})
	runtime/asm_amd64.s:1695 +0x1 fp=0xc0006b5fe8 sp=0xc0006b5fe0 pc=0x892481
created by net/http.(*connReader).startBackgroundRead in goroutine 111
	net/http/server.go:677 +0xba

### OS

Windows

### GPU

Nvidia

### CPU

Intel

### Ollama version

0.3.0
GiteaMirror added the bug label 2026-04-12 14:37:11 -05:00

@rick-github commented on GitHub (Jul 29, 2024):

```
runtime: VirtualAlloc of 117440512 bytes failed with errno=1455
fatal error: out of memory
```

At a guess, I'd say you are trying to create a model on a machine that doesn't have the resources to do so. There are easier ways to do it: download the quantized model from Hugging Face and use that instead:

https://huggingface.co/bartowski/Meta-Llama-3.1-70B-Instruct-GGUF
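
For reference, importing a pre-quantized GGUF needs only a one-line Modelfile pointing at the downloaded file. A minimal sketch, assuming a Q4_K_M quantization downloaded from the repository above to the reporter's D:\LLMs directory (the exact filename is an assumption):

```
FROM D:\LLMs\Meta-Llama-3.1-70B-Instruct-Q4_K_M.gguf
```

Then run `ollama create llama3.1:70b -f Modelfile`. Because the weights are already in GGUF format, the import should copy the blob rather than convert the original safetensors, avoiding the tensor-processing step that exhausted memory here.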


@pdevine commented on GitHub (Sep 2, 2024):

I'm going to go ahead and close the issue.

Reference: github-starred/ollama#3787