[GH-ISSUE #10167] Quantized Mistral small 3.1 doesn't utilize NVIDIA GPUs #32432


GiteaMirror · 2026-04-22T13:42:34-05:00

GiteaMirror commented

2026-04-22 13:42:34 -05:00

Originally created by @lowlyocean on GitHub (Apr 7, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10167

What is the issue?

1. Pulled the new f16 Mistral Small 3.1
2. Created a new Modelfile containing only the line FROM mistral-small3.1:24b-instruct-2503-fp16
3. Ran the following command to create a Q2_K quant:
   ollama create -q q2_k mistral-small3.1:24b-instruct-2503-q2_k
4. Ran the model, and then checked ollama ps
5. Noticed that it's 100% loaded into CPU (instead of using the 12GB + 3GB of VRAM from two NVIDIA GPUs)

Log keeps saying there is not enough VRAM to allocate any layers, but the entire quantized model is only 10GB

Relevant log output

time=2025-04-07T15:19:33.122-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.9 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T15:19:33.122-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T15:19:33.122-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T15:19:33.122-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.9 GiB 2.4 GiB]"
time=2025-04-07T15:19:33.123-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="51.1 GiB" before.free_swap="44.0 GiB" now.total="63.9 GiB" now.free="51.1 GiB" now.free_swap="44.0 GiB"
time=2025-04-07T15:19:33.136-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.9 GiB" now.total="12.0 GiB" now.free="9.9 GiB" now.used="2.1 GiB"
time=2025-04-07T15:19:33.152-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T15:19:33.153-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.9 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T15:19:33.153-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T15:19:33.153-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T15:19:33.153-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="51.1 GiB" before.free_swap="44.0 GiB" now.total="63.9 GiB" now.free="51.1 GiB" now.free_swap="44.0 GiB"
time=2025-04-07T15:19:33.167-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.9 GiB" now.total="12.0 GiB" now.free="9.9 GiB" now.used="2.1 GiB"
time=2025-04-07T15:19:33.183-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
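
The numbers in the DEBUG lines above can be added up to see why no layer is placed. This is a rough sketch in Python, not ollama's code: the formula in `min_required_bytes` is an assumption (a lower bound made of the logged `gpu_zer_overhead`, graph size, `minimum_memory` and one `layer_size`), and the real check in memory.go may include further terms.

```python
# Values copied from the memory.go:194 DEBUG lines in this report.
MIB = 1024 ** 2
GIB = 1024 ** 3


def min_required_bytes(gpu_zero_overhead, graph, minimum_memory, layer_size):
    """Assumed lower bound on VRAM needed before one layer fits (hypothetical formula)."""
    return gpu_zero_overhead + graph + minimum_memory + layer_size


def fits(avail_bytes, needed_bytes):
    """True when the available VRAM covers the assumed minimum."""
    return avail_bytes >= needed_bytes


# "available" as logged for each card.
available = {"RTX 3060": 9.9 * GIB, "GTX 1060 3GB": 2.4 * GIB}

# The two estimation passes in the log differ only in layer_size and graph size.
passes = {
    "first pass": min_required_bytes(9.5 * GIB, 853.3 * MIB, 479199232, 232.6 * MIB),
    "second pass": min_required_bytes(9.5 * GIB, 213.3 * MIB, 479199232, 208.6 * MIB),
}

for label, needed in passes.items():
    for gpu, avail in available.items():
        print(
            f"{label}: {gpu} needs >= {needed / GIB:.2f} GiB, "
            f"has {avail / GIB:.1f} GiB, fits={fits(avail, needed)}"
        )
```

Under this assumption both passes need more than 10 GiB on a card that reports 9.9 GiB available, and the 9.5 GiB `gpu_zer_overhead` term accounts for almost all of it. The size of the quantized weights barely enters the comparison.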

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.6.5


GiteaMirror added the bug label 2026-04-22 13:42:35 -05:00

GiteaMirror closed this issue

2026-04-22 13:42:35 -05:00

GiteaMirror commented

2026-04-22 13:42:36 -05:00

@btibor91 commented on GitHub (Apr 7, 2025):

I am experiencing the same problem with mistral-small3.1:24b (24b-instruct-2503-q4_K_M) - size is 15 GB, and with 20 GB of VRAM, it gets split into 8 GB and 7 GB between VRAM and RAM.

ollama 0.6.5 / Ubuntu 22.04

ollama ps shows size 26 GB, but ollama ls only 15 GB

NVIDIA RTX 4000 SFF Ada (20GB VRAM)

source=server.go:105 msg="system memory" total="62.6 GiB" free="48.9 GiB" free_swap="26.3 GiB"

source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=24 layers.split="" memory.available="[18.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="24.7 GiB" memory.required.partial="18.7 GiB" memory.required.kv="640.0 MiB" memory.required.allocations="[18.7 GiB]" memory.weights.total="13.1 GiB" memory.weights.repeating="12.7 GiB" memory.weights.nonrepeating="360.0 MiB" memory.graph.full="426.7 MiB" memory.graph.partial="426.7 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"

source=ggml.go:289 msg="model weights" buffer=CPU size="6.8 GiB"

source=ggml.go:289 msg="model weights" buffer=CUDA0 size="7.6 GiB"
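
To compare these figures with what `ollama ps` and `ollama ls` print, the offload line can be split into fields and converted to bytes. The sketch below is Python written for illustration; `parse_logfmt` and `to_bytes` are hypothetical helpers, and the sample line is an abbreviated copy of the server.go:138 entry above.

```python
import re

# key=value pairs, where the value is either a quoted string or a bare token.
_PAIR = re.compile(r'([\w.]+)=("(?:[^"\\]|\\.)*"|\S+)')
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}


def parse_logfmt(line):
    """Split a key=value log line into a dict, stripping surrounding quotes."""
    return {key: value.strip('"') for key, value in _PAIR.findall(line)}


def to_bytes(text):
    """Convert a size such as '24.7 GiB' or '[18.9 GiB]' to bytes."""
    number, unit = text.strip("[]").split()
    return float(number) * _UNITS[unit]


line = (
    'source=server.go:138 msg=offload library=cuda layers.model=41 layers.offload=24 '
    'memory.available="[18.9 GiB]" memory.required.full="24.7 GiB" '
    'memory.weights.total="13.1 GiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"'
)

fields = parse_logfmt(line)
full = to_bytes(fields["memory.required.full"])
projector = to_bytes(fields["projector.weights"]) + to_bytes(fields["projector.graph"])

print(f"full estimate: {full / 1e9:.1f} GB (decimal)")
print(f"projector total: {projector / 1024 ** 3:.2f} GiB")
print(f"projector share of full estimate: {projector / full:.0%}")
```

Converted to decimal units, the 24.7 GiB full estimate is about 26.5 GB, which suggests that `ollama ps` reports the estimate while `ollama ls` reports the 15 GB file size. The projector entries add up to roughly 9.55 GiB, close to the `gpu_zer_overhead="9.5 GiB"` in the original report, though that link is an inference from the numbers rather than something the logs state.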


GiteaMirror commented

2026-04-22 13:42:38 -05:00

@rick-github commented on GitHub (Apr 7, 2025):

time=2025-04-07T19:58:31.332Z level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1
 layers.model=41 layers.offload=14 layers.split="" memory.available="[15.6 GiB]" memory.gpu_overhead="0 B"
 memory.required.full="25.0 GiB" memory.required.partial="15.4 GiB" memory.required.kv="640.0 MiB"
 memory.required.allocations="[15.4 GiB]" memory.weights.total="13.1 GiB" memory.weights.repeating="12.7 GiB"
 memory.weights.nonrepeating="360.0 MiB" memory.graph.full="426.7 MiB" memory.graph.partial="426.7 MiB"
 projector.weights="769.3 MiB" projector.graph="8.8 GiB"

It looks like ollama is wildly over-estimating the VRAM required. nvidia-smi shows the backend only allocated 5.3G where ollama estimated 15.4G.
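
The size of the gap can be worked out from the figures quoted in this comment. The Python below is only arithmetic on those numbers; treating nvidia-smi's "5.3G" as GiB and singling out `projector.graph` as the suspect term are both assumptions made for illustration.

```python
estimated_gib = 15.4        # memory.required.partial from the log line above
allocated_gib = 5.3         # reported by nvidia-smi ("5.3G", treated here as GiB)
projector_graph_gib = 8.8   # projector.graph from the same log line

gap = estimated_gib - allocated_gib
ratio = estimated_gib / allocated_gib
without_projector_graph = estimated_gib - projector_graph_gib

print(f"over-estimate: {gap:.1f} GiB ({ratio:.1f}x the real allocation)")
print(f"estimate minus projector.graph: {without_projector_graph:.1f} GiB")
```

With the projector graph term taken out, the estimate lands much nearer the observed allocation, which is consistent with that term being the main source of the over-estimate. The thread does not confirm this.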


GiteaMirror commented

2026-04-22 13:42:39 -05:00

@btibor91 commented on GitHub (Apr 7, 2025):

Possibly related to #10128


GiteaMirror commented

2026-04-22 13:42:41 -05:00

@rick-github commented on GitHub (Apr 7, 2025):

No flash attention so not directly related to #10128. But ollama has always had issues with correct estimations, it's just gotten worse with the new go-based runner - gemma3 has the same problem (#9791, #10040)


GiteaMirror commented

2026-04-22 13:42:43 -05:00

@jessegross commented on GitHub (Apr 7, 2025):

https://github.com/ollama/ollama/issues/9791#issuecomment-2755958292


GiteaMirror commented

2026-04-22 13:42:44 -05:00

@lowlyocean commented on GitHub (Apr 7, 2025):

Is this a regression worthy of a rollback until the issue with the new runner gets sorted out?

It seems quite severe for a 10GB model to have no layers at all allocated to 15GB available VRAM


GiteaMirror commented

2026-04-22 13:42:44 -05:00

@maxi1134 commented on GitHub (Apr 7, 2025):

Does anyone know how I can force the offload?

I tried setting num_gpu to a wildly large number (170) for mistral small 3.1 in an attempt to get it to run, but it only offloads 16 GB out of the 24 GB available on my 3090.


GiteaMirror commented

2026-04-22 13:42:45 -05:00

@rick-github commented on GitHub (Apr 7, 2025):

16G is the full model, it's fully offloaded.


GiteaMirror commented

2026-04-22 13:42:45 -05:00

@lowlyocean commented on GitHub (Apr 7, 2025):

I just tried setting PARAMETER num_gpu 32 by adding to the Modelfile dumped from the Q2_K quant and regenerating the model from it. Still seeing 100% CPU use


GiteaMirror commented

2026-04-22 13:42:46 -05:00

@rick-github commented on GitHub (Apr 7, 2025):

I suspect you are hitting a corner case. Normally, ollama will compute that at least one layer will fit, and will list the GPU backends in the list of backends to consider when the runner loads the model. This is where you can override it by setting num_gpu: the runner loads a GPU backend and then num_gpu kicks in. In your case, ollama has decided that no way will it be able to load a layer into the GPU, so the GPU backends are not included in the list of backends to choose from. You can hack around this by adding the path to the GPU library directory to the PATH environment variable in the server.
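
Besides the Modelfile `PARAMETER` line, `num_gpu` can also be passed per request through the `options` field of the REST API. The Python sketch below only builds such a request; `build_generate_request` is a hypothetical helper, and the host, prompt and model tag are placeholders taken from this thread. As described above, the override is only expected to take effect once a GPU backend has actually been loaded.

```python
import json
import urllib.request


def build_generate_request(model, prompt, num_gpu, host="http://localhost:11434"):
    """Build (but do not send) a POST to /api/generate that sets options.num_gpu."""
    body = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": num_gpu},  # number of layers to offload to the GPU
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request("mistral-small3.1:24b-instruct-2503-q2_k", "Hello", num_gpu=32)
print(req.full_url)
print(req.data.decode("utf-8"))
# To send it against a running server: urllib.request.urlopen(req)
```

Checking `ollama ps` after such a request shows whether the CPU/GPU split changed.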


GiteaMirror commented

2026-04-22 13:42:46 -05:00

@rick-github commented on GitHub (Apr 7, 2025):

Although, having said that, I see that you have 9.9G available on one of your cards. I think that ollama should be able to load at least one layer there, so maybe my guess above is incorrect. If you can supply server logs it may be easier to diagnose.


GiteaMirror commented

2026-04-22 13:42:47 -05:00

@lowlyocean commented on GitHub (Apr 7, 2025):

> no way will it be able to load a layer into the GPU

Is there any part of the log that can confirm this is happening? I have other (larger) models than this 10GB quant which get loaded fully onto the GPUs, even with this latest release (0.6.5) of ollama. So that also rules out failing to find the GPU libraries.

Could the quantization to Q2_K somehow be making the model a single massive layer?


GiteaMirror commented

2026-04-22 13:42:47 -05:00

@maxi1134 commented on GitHub (Apr 7, 2025):

> 16G is the full model, it's fully offloaded.

Odd, it still shows some CPU usage in ollama ps

![Image](https://github.com/user-attachments/assets/54c28bdc-6baf-4749-b24b-58c38e9a72f1)

GiteaMirror commented

2026-04-22 13:42:48 -05:00

@lowlyocean commented on GitHub (Apr 7, 2025):

> Although, having said that, I see that you have 9.9G available on one of your cards. I think that ollama should be able to load at least one layer there, so maybe my guess above is incorrect. If you can supply server logs it may be easier to diagnose.

2025/04/07 19:19:06 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://i.p.add.ress:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:modelDir OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-04-07T19:19:06.150-04:00 level=INFO source=images.go:458 msg="total blobs: 28"
time=2025-04-07T19:19:06.152-04:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=routes.go:1298 msg="Listening on i.p.add.ress:11434 (version 0.6.5)"
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=sched.go:107 msg="starting llm scheduler"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=4 efficiency=0 threads=8
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-04-07T19:19:06.154-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="PATHs"
time=2025-04-07T19:19:06.155-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvml.dll"
time=2025-04-07T19:19:06.156-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-04-07T19:19:06.175-04:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\Windows\system32\nvml.dll
time=2025-04-07T19:19:06.177-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-04-07T19:19:06.177-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="PATHs"
initializing C:\Windows\system32\nvcuda.dll
dlsym: cuInit - address
dlsym: cuDriverGetVersion - address
dlsym: cuDeviceGetCount - address
dlsym: cuDeviceGet - address
dlsym: cuDeviceGetAttribute - address
dlsym: cuDeviceGetUuid - address
dlsym: cuDeviceGetName - address
dlsym: cuCtxCreate_v3 - address
dlsym: cuMemGetInfo_v2 - address
dlsym: cuCtxDestroy - address
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 2
time=2025-04-07T19:19:06.198-04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=2 library=C:\Windows\system32\nvcuda.dll
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA totalMem 12287 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA freeMem 11247 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] Compute Capability 8.6
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA totalMem 3071 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA freeMem 2462 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] Compute Capability 6.1
time=2025-04-07T19:19:06.376-04:00 level=INFO source=gpu.go:319 msg="detected OS VRAM overhead" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB"
time=2025-04-07T19:19:06.379-04:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
releasing cuda driver library
releasing nvml library
time=2025-04-07T19:19:06.380-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="11.0 GiB"
time=2025-04-07T19:19:06.380-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB"
[GIN] 2025/04/07 - 19:19:15 | 200 |            0s |     i.p.add.ress | HEAD     "/"
[GIN] 2025/04/07 - 19:19:15 | 200 |     51.6352ms |     i.p.add.ress | POST     "/api/show"
time=2025-04-07T19:19:15.940-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="48.0 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:15.951-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="11.0 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:15.967-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:15.968-04:00 level=DEBUG source=sched.go:183 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=6 gpu_count=2
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=sched.go:226 msg="loading first model" model=model_blob_file
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.027-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.042-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T19:19:16.045-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.058-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.074-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.089-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.104-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T19:19:16.106-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.119-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.135-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.135-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.150-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.165-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.166-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.166-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.181-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.212-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.228-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.229-04:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="47.9 GiB" free_swap="42.4 GiB"
time=2025-04-07T19:19:16.229-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.229-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.243-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.259-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.260-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=32 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T19:19:16.260-04:00 level=WARN source=server.go:173 msg="flash attention enabled but not supported by gpu"
time=2025-04-07T19:19:16.260-04:00 level=WARN source=server.go:196 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]
time=2025-04-07T19:19:16.314-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T19:19:16.324-04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-04-07T19:19:16.324-04:00 level=INFO source=server.go:405 msg="starting llama server" cmd="%LocalAppData%\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model modelDir\\model_blob_file --ctx-size 4096 --batch-size 512 --n-gpu-layers 32 --verbose --threads 4 --no-mmap --parallel 1 --port 54638"
time=2025-04-07T19:19:16.324-04:00 level=DEBUG source=server.go:423 msg=subprocess environment="env variables"
time=2025-04-07T19:19:16.329-04:00 level=INFO source=sched.go:451 msg="loaded runners" count=1
time=2025-04-07T19:19:16.329-04:00 level=INFO source=server.go:580 msg="waiting for llama runner to start responding"
time=2025-04-07T19:19:16.329-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server error"
time=2025-04-07T19:19:16.354-04:00 level=INFO source=runner.go:816 msg="starting ollama engine"
time=2025-04-07T19:19:16.355-04:00 level=INFO source=runner.go:879 msg="Server listening on 127.0.0.1:54638"
time=2025-04-07T19:19:16.408-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.name default=""
time=2025-04-07T19:19:16.408-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.description default=""
time=2025-04-07T19:19:16.408-04:00 level=INFO source=ggml.go:67 msg="" architecture=mistral3 file_type=Q2_K name="" description="" num_tensors=585 num_key_values=43
time=2025-04-07T19:19:16.408-04:00 level=DEBUG source=ggml.go:93 msg="skipping path which is not part of ollama" path=C:\Python311\Scripts
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 55
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 20
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-04-07T19:19:16.437-04:00 level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(clang)
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_1.weight shape="[1024 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_2.weight shape="[5120 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.patch_merger.merging_layer.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output.weight shape="[5120 131072]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=token_embd.weight shape="[5120 131072]" dtype=14 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.encoder_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.patch_conv.weight shape="[14 14 3 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=INFO source=ggml.go:289 msg="model weights" buffer=CPU size="9.4 GiB"
time=2025-04-07T19:19:16.582-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
time=2025-04-07T19:19:16.582-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.09"
time=2025-04-07T19:19:16.833-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.25"
time=2025-04-07T19:19:17.084-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.43"
time=2025-04-07T19:19:17.335-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.62"
time=2025-04-07T19:19:17.587-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.81"
time=2025-04-07T19:19:17.838-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.99"
time=2025-04-07T19:19:17.866-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CPU buffer_type=CPU
time=2025-04-07T19:19:17.867-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=token_embd.weight type=q6_K shape="[5120 131072]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output.weight type=q2_K shape="[5120 131072]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.patch_conv.weight type=f16 shape="[14 14 3 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.encoder_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_1.weight type=f16 shape="[1024 5120]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_2.weight type=f16 shape="[5120 5120]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.patch_merger.merging_layer.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:18.088-04:00 level=INFO source=server.go:619 msg="llama runner started in 1.76 seconds"
time=2025-04-07T19:19:18.088-04:00 level=DEBUG source=sched.go:464 msg="finished setting up runner" model=model_blob_file
[GIN] 2025/04/07 - 19:19:18 | 200 |    2.1745499s |     i.p.add.ress | POST     "/api/generate"
time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:468 msg="context for request finished"
time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=model_blob_file duration=5m0s
time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=model_blob_file refCount=0
time=2025-04-07T19:19:22.300-04:00 level=DEBUG source=sched.go:577 msg="evaluating already loaded" model=model_blob_file
time=2025-04-07T19:19:22.330-04:00 level=DEBUG source=routes.go:1522 msg="chat request" images=0 prompt="[SYSTEM_PROMPT]You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. \"yesterday\" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.[/SYSTEM_PROMPT][INST]tell a joke[/INST]"
time=2025-04-07T19:19:22.365-04:00 level=DEBUG source=process_text.go:304 msg="adding bos token to prompt" id=1
time=2025-04-07T19:19:22.365-04:00 level=DEBUG source=cache.go:136 msg="loading cache slot" id=0 cache=0 prompt=362 used=0 remaining=362
[GIN] 2025/04/07 - 19:20:49 | 200 |         1m27s |     i.p.add.ress | POST     "/api/chat"
time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:409 msg="context for request finished"
time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=model_blob_file duration=5m0s
time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=model_blob_file refCount=0
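
For readers who want to gauge how much memory the `found tensor` entries above represent, here is a minimal sketch that tallies tensor sizes straight from such debug lines. The helper names (`tensor_bytes`, `total_bytes`) are hypothetical and not part of Ollama; the only facts assumed are that f16 elements take 2 bytes and f32 elements take 4 bytes. It says nothing about how Ollama itself estimates GPU memory.

```python
import re

# Bytes per element for the two tensor types that appear in the log excerpt.
BYTES_PER_ELEMENT = {"f16": 2, "f32": 4}

# Matches e.g. name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]"
# and the unquoted one-dimensional form shape=[1024].
TENSOR_RE = re.compile(r'name=(\S+) type=(\w+) shape="?\[([\d ]+)\]')


def tensor_bytes(line):
    """Return (name, size_in_bytes) for a 'found tensor' line, else None."""
    match = TENSOR_RE.search(line)
    if match is None:
        return None
    name, dtype, dims = match.groups()
    width = BYTES_PER_ELEMENT.get(dtype)
    if width is None:
        # Unknown (for example quantized) types are skipped, not guessed.
        return None
    count = 1
    for dim in dims.split():
        count *= int(dim)
    return name, count * width


def total_bytes(lines, prefix=""):
    """Sum the sizes of all parsed tensors whose name starts with prefix."""
    total = 0
    for line in lines:
        parsed = tensor_bytes(line)
        if parsed is not None and parsed[0].startswith(prefix):
            total += parsed[1]
    return total


# The nine tensors of vision block 1, copied from the log (timestamps trimmed).
sample = [
    'msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024]',
    'msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]"',
    'msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]"',
    'msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]"',
    'msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]"',
    'msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024]',
    'msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]"',
    'msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]"',
    'msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]"',
]

block_bytes = total_bytes(sample, prefix="v.blk.1.")
print(block_bytes, f"{block_bytes / 2**20:.1f} MiB")  # 33562624 bytes, about 32.0 MiB
```

Feeding the full server log to `total_bytes` with `prefix="v."` would give the same tally for every vision block listed.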

@lowlyocean commented on GitHub (Apr 7, 2025): > Although, having said that, I see that you have 9.9G available on one of your cards. I think that ollama should be able to load at least one layer there, so maybe my guess above is incorrect. If you can supply server logs it may be easier to diagnose. ``` 2025/04/07 19:19:06 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://i.p.add.ress:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:modelDir OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]" time=2025-04-07T19:19:06.150-04:00 level=INFO source=images.go:458 msg="total blobs: 28" time=2025-04-07T19:19:06.152-04:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0" time=2025-04-07T19:19:06.153-04:00 level=INFO source=routes.go:1298 msg="Listening on i.p.add.ress:11434 (version 0.6.5)" time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=sched.go:107 msg="starting llm scheduler" time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs" time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu_windows.go:167 msg=packages count=1 time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=4 efficiency=0 threads=8 
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-04-07T19:19:06.154-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="PATHs"
time=2025-04-07T19:19:06.155-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvml.dll"
time=2025-04-07T19:19:06.156-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-04-07T19:19:06.175-04:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\Windows\system32\nvml.dll
time=2025-04-07T19:19:06.177-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-04-07T19:19:06.177-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="PATHs"
initializing C:\Windows\system32\nvcuda.dll
dlsym: cuInit - address
dlsym: cuDriverGetVersion - address
dlsym: cuDeviceGetCount - address
dlsym: cuDeviceGet - address
dlsym: cuDeviceGetAttribute - address
dlsym: cuDeviceGetUuid - address
dlsym: cuDeviceGetName - address
dlsym: cuCtxCreate_v3 - address
dlsym: cuMemGetInfo_v2 - address
dlsym: cuCtxDestroy - address
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 2
time=2025-04-07T19:19:06.198-04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=2 library=C:\Windows\system32\nvcuda.dll
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA totalMem 12287 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA freeMem 11247 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] Compute Capability 8.6
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA totalMem 3071 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA freeMem 2462 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] Compute Capability 6.1
time=2025-04-07T19:19:06.376-04:00 level=INFO source=gpu.go:319 msg="detected OS VRAM overhead" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB"
time=2025-04-07T19:19:06.379-04:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
releasing cuda driver library
releasing nvml library
time=2025-04-07T19:19:06.380-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="11.0 GiB"
time=2025-04-07T19:19:06.380-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB"
[GIN] 2025/04/07 - 19:19:15 | 200 | 0s | i.p.add.ress | HEAD "/"
[GIN] 2025/04/07 - 19:19:15 | 200 | 51.6352ms | i.p.add.ress | POST "/api/show"
time=2025-04-07T19:19:15.940-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="48.0 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:15.951-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="11.0 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:15.967-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" 
now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:15.968-04:00 level=DEBUG source=sched.go:183 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=6 gpu_count=2 time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=sched.go:226 msg="loading first model" model=model_blob_file time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]" time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB" time=2025-04-07T19:19:16.027-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T19:19:16.042-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB" time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]" 
time=2025-04-07T19:19:16.045-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB" time=2025-04-07T19:19:16.058-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T19:19:16.074-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB" time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]" time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB" time=2025-04-07T19:19:16.089-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" 
now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T19:19:16.104-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB" time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]" time=2025-04-07T19:19:16.106-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB" time=2025-04-07T19:19:16.119-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T19:19:16.135-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:16.135-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little 
memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB" time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]" time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB" time=2025-04-07T19:19:16.150-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T19:19:16.165-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:16.166-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB" time=2025-04-07T19:19:16.166-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" 
id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB" time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]" time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB" time=2025-04-07T19:19:16.181-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB" time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda 
variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB" time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB" time=2025-04-07T19:19:16.212-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T19:19:16.228-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:16.229-04:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="47.9 GiB" free_swap="42.4 GiB" time=2025-04-07T19:19:16.229-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]" time=2025-04-07T19:19:16.229-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB" time=2025-04-07T19:19:16.243-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 
GiB" now.used="2.3 GiB" time=2025-04-07T19:19:16.259-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB" time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB" time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T19:19:16.260-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=32 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB" time=2025-04-07T19:19:16.260-04:00 level=WARN source=server.go:173 msg="flash attention enabled but not supported by gpu" 
time=2025-04-07T19:19:16.260-04:00 level=WARN source=server.go:196 msg="quantized kv cache requested but flash attention disabled" type=q8_0 time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[] time=2025-04-07T19:19:16.314-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+" time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1 time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06 time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540 time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06 time=2025-04-07T19:19:16.324-04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu" time=2025-04-07T19:19:16.324-04:00 level=INFO source=server.go:405 msg="starting llama server" cmd="%LocalAppData%\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model modelDir\\model_blob_file --ctx-size 4096 --batch-size 512 --n-gpu-layers 32 --verbose --threads 4 --no-mmap --parallel 1 --port 54638" time=2025-04-07T19:19:16.324-04:00 level=DEBUG source=server.go:423 msg=subprocess environment="env variables" time=2025-04-07T19:19:16.329-04:00 level=INFO source=sched.go:451 msg="loaded runners" count=1 time=2025-04-07T19:19:16.329-04:00 level=INFO source=server.go:580 msg="waiting for llama runner to start responding" time=2025-04-07T19:19:16.329-04:00 level=INFO 
source=server.go:614 msg="waiting for server to become available" status="llm server error" time=2025-04-07T19:19:16.354-04:00 level=INFO source=runner.go:816 msg="starting ollama engine" time=2025-04-07T19:19:16.355-04:00 level=INFO source=runner.go:879 msg="Server listening on 127.0.0.1:54638" time=2025-04-07T19:19:16.408-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.name default="" time=2025-04-07T19:19:16.408-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.description default="" time=2025-04-07T19:19:16.408-04:00 level=INFO source=ggml.go:67 msg="" architecture=mistral3 file_type=Q2_K name="" description="" num_tensors=585 num_key_values=43 time=2025-04-07T19:19:16.408-04:00 level=DEBUG source=ggml.go:93 msg="skipping path which is not part of ollama" path=C:\Python311\Scripts ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0 ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 55 ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0 ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 20 ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0 load_backend: loaded CPU backend from userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll time=2025-04-07T19:19:16.437-04:00 level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(clang) time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_1.weight shape="[1024 5120]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_2.weight shape="[5120 5120]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.patch_merger.merging_layer.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output.weight shape="[5120 131072]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=token_embd.weight shape="[5120 131072]" dtype=14 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.0.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_down.weight 
shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG 
source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_q.weight shape="[1024 1024]" 
dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG 
source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_norm.weight shape=[1024] dtype=0 
buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=v.blk.18.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.20.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.23.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.4.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.7.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.encoder_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.patch_conv.weight shape="[14 14 3 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_output.weight shape="[4096 5120]" dtype=11 
buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.1.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG 
source=ggml.go:220 msg="created tensor" name=blk.3.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_up.weight 
shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=blk.7.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU 
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_norm.weight 
shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG 
source=ggml.go:220 msg="created tensor" name=blk.11.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_up.weight shape="[5120 32768]" dtype=10 
buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.14.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU 
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.17.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.19.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU 
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.22.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_v.weight shape="[5120 1024]" 
dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created 
tensor" name=blk.26.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU 
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.29.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_v.weight shape="[5120 1024]" 
dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created 
tensor" name=blk.33.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU 
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.36.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_v.weight shape="[5120 1024]" 
dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=INFO source=ggml.go:289 msg="model weights" buffer=CPU size="9.4 GiB" time=2025-04-07T19:19:16.582-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model" time=2025-04-07T19:19:16.582-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.09" time=2025-04-07T19:19:16.833-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.25" time=2025-04-07T19:19:17.084-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.43" time=2025-04-07T19:19:17.335-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.62" time=2025-04-07T19:19:17.587-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.81" time=2025-04-07T19:19:17.838-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.99" time=2025-04-07T19:19:17.866-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CPU buffer_type=CPU time=2025-04-07T19:19:17.867-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+" 
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1 time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06 time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540 time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06 time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=token_embd.weight type=q6_K shape="[5120 131072]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_gate.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_output.weight type=q3_K shape="[4096 5120]" 
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_v.weight type=f16 shape="[5120 1024]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_gate.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_output.weight type=q3_K shape="[4096 5120]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_v.weight type=f16 shape="[5120 1024]" 
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_gate.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_output.weight type=q3_K shape="[4096 5120]" 
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output.weight type=q2_K shape="[5120 131072]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.patch_conv.weight type=f16 shape="[14 14 3 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.encoder_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_down.weight type=f16 shape="[4096 1024]" 
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_output.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_norm.weight type=f32 shape=[1024] 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_norm.weight type=f32 shape=[1024] 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_q.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_gate.weight type=f16 shape="[1024 4096]" 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_k.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_up.weight type=f16 shape="[1024 
4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_v.weight type=f16 shape="[1024 
1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_down.weight type=f16 
shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_output.weight 
type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=v.blk.18.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=v.blk.19.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=v.blk.21.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 
msg="found tensor" name=v.blk.22.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_1.weight type=f16 shape="[1024 5120]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=mm.linear_2.weight type=f16 shape="[5120 5120]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.patch_merger.merging_layer.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:18.088-04:00 level=INFO source=server.go:619 msg="llama runner started in 1.76 seconds" time=2025-04-07T19:19:18.088-04:00 level=DEBUG source=sched.go:464 msg="finished setting up runner" model=model_blob_file [GIN] 2025/04/07 - 19:19:18 | 200 | 2.1745499s | i.p.add.ress | POST "/api/generate" time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:468 msg="context for request finished" time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=model_blob_file duration=5m0s time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=model_blob_file refCount=0 time=2025-04-07T19:19:22.300-04:00 level=DEBUG source=sched.go:577 msg="evaluating already loaded" model=model_blob_file time=2025-04-07T19:19:22.330-04:00 level=DEBUG source=routes.go:1522 msg="chat request" images=0 prompt="[SYSTEM_PROMPT]You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. 
\"yesterday\" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.[/SYSTEM_PROMPT][INST]tell a joke[/INST]" time=2025-04-07T19:19:22.365-04:00 level=DEBUG source=process_text.go:304 msg="adding bos token to prompt" id=1 time=2025-04-07T19:19:22.365-04:00 level=DEBUG source=cache.go:136 msg="loading cache slot" id=0 cache=0 prompt=362 used=0 remaining=362 [GIN] 2025/04/07 - 19:20:49 | 200 | 1m27s | i.p.add.ress | POST "/api/chat" time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:409 msg="context for request finished" time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=model_blob_file duration=5m0s time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=model_blob_file refCount=0 ```

GiteaMirror commented

2026-04-22 13:42:49 -05:00

@rick-github commented on GitHub (Apr 8, 2025):

@maxi1134

> Odd, it still shows some CPU usage in `ollama ps`

The output from `ollama ps` is calculated before the value of `num_gpu` is taken into account, so it is incorrect when `num_gpu` has been set manually.
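For readers who want to see how a CPU/GPU split like this can be derived, here is a minimal sketch. It assumes the split is computed from two byte counts, `size` (total) and `size_vram` (portion in VRAM), which is how the `/api/ps` response reports a loaded model; the function name `processor_split` and the rounding are illustrative assumptions, not Ollama's actual code. Because these numbers come from the scheduler's estimate, a manual `num_gpu` override would not be reflected in them.

```python
def processor_split(size: int, size_vram: int) -> str:
    """Describe how a loaded model is split between CPU and GPU memory.

    size      -- estimated total bytes for the model (assumed field)
    size_vram -- estimated bytes placed in VRAM (assumed field)
    """
    if size <= 0:
        return "Unknown"
    if size_vram == 0:
        return "100% CPU"
    if size_vram >= size:
        return "100% GPU"
    cpu_pct = round((size - size_vram) / size * 100)
    return f"{cpu_pct}%/{100 - cpu_pct}% CPU/GPU"


# Hypothetical byte counts, not taken from this thread.
print(processor_split(10_000_000_000, 7_500_000_000))
```

The real layer placement is best confirmed from the `msg=offload` line in the server log rather than from this summary.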


GiteaMirror commented

2026-04-22 13:42:50 -05:00

@rick-github commented on GitHub (Apr 8, 2025):

@lowlyocean

time=2025-04-07T19:19:16.260-04:00 level=INFO source=server.go:138 msg=offload library=cuda
 layers.requested=32 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]"
 memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B"
 memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB"
 memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB"
 memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]

Yep, it decided that it couldn't fit a single layer anywhere. It might have something to do with the huge projector graph (8.8 GiB), which basically edges everything else out. You should be able to get around that by adding `%LocalAppData%\Programs\Ollama\lib\cuda_v12` to PATH in the server environment.
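The "edges everything out" effect can be checked with back-of-the-envelope arithmetic using the values from the `memory.go:194` debug lines in this thread (`available`, `gpu_zer_overhead`, `layer_size`, `partial_offload`). This is a simplified model written for illustration: the helper `fits_any_layer` is hypothetical, and the exact condition Ollama evaluates may include further terms such as `minimum_memory`.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def fits_any_layer(available, projector_overhead, layer_size, graph):
    """Simplified check (an assumption, not Ollama's exact formula):
    after reserving the projector, is there room for one layer plus
    the compute graph?"""
    remaining = available - projector_overhead
    return remaining >= layer_size + graph


# Values copied from the debug log lines in this thread.
gpus = {"RTX 3060": 9.7 * GIB, "GTX 1060 3GB": 2.4 * GIB}
overhead = 9.5 * GIB   # gpu_zer_overhead (projector weights + graph)
layer = 232.6 * MIB    # layer_size
graph = 853.3 * MIB    # partial_offload / full_offload

for name, available in gpus.items():
    remaining_mib = (available - overhead) / MIB
    ok = fits_any_layer(available, overhead, layer, graph)
    print(f"{name}: {remaining_mib:.1f} MiB left after projector, fits a layer: {ok}")
```

Under these assumptions the 3060 is left with roughly 200 MiB after the projector reservation, which is less than a single layer, and the 1060 is already negative, so neither card is offered any layers.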


GiteaMirror commented

2026-04-22 13:42:50 -05:00

@wbste commented on GitHub (Apr 8, 2025):

Same issue on my 3090: I can't get the whole model to load into VRAM. Latest Ollama, on Windows, not using flash attention. Other 27B and 32B parameter models work fine 100% offloaded. The log below says it needs 24.5 GiB for a full offload, which seems high for a Q4_K_M quant?

time=2025-04-07T17:27:22.111-07:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="52.0 GiB" free_swap="53.2 GiB"
time=2025-04-07T17:27:22.111-07:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[21.8 GiB]"
time=2025-04-07T17:27:22.112-07:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=33 layers.split="" memory.available="[21.8 GiB]" memory.gpu_overhead="0 B" memory.required.full="24.5 GiB" memory.required.partial="21.6 GiB" memory.required.kv="640.0 MiB" memory.required.allocations="[21.6 GiB]" memory.weights.total="13.1 GiB" memory.weights.repeating="12.7 GiB" memory.weights.nonrepeating="360.0 MiB" memory.graph.full="426.7 MiB" memory.graph.partial="426.7 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"

Here's gemma3:27b for reference, 100% GPU offloading, same quant:

time=2025-04-07T17:35:35.913-07:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="51.9 GiB" free_swap="52.9 GiB"
time=2025-04-07T17:35:35.914-07:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[21.7 GiB]"
time=2025-04-07T17:35:35.915-07:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=63 layers.offload=63 layers.split="" memory.available="[21.7 GiB]" memory.gpu_overhead="0 B" memory.required.full="19.7 GiB" memory.required.partial="19.7 GiB" memory.required.kv="1.2 GiB" memory.required.allocations="[19.7 GiB]" memory.weights.total="15.4 GiB" memory.weights.repeating="14.3 GiB" memory.weights.nonrepeating="1.1 GiB" memory.graph.full="565.0 MiB" memory.graph.partial="1.6 GiB" projector.weights="795.9 MiB" projector.graph="1.0 GiB"
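To compare the two `msg=offload` lines above more directly, the sketch below parses the `key="value"` fields and computes how much of `memory.required.full` is attributable to the projector. The helper names (`parse_fields`, `to_bytes`, `projector_share`) are made up for this example, and the log lines are abbreviated to the fields being compared.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}
# Matches key=value where the value is either quoted or a bare token.
PAIR = re.compile(r'([\w.]+)=(?:"([^"]*)"|(\S+))')


def parse_fields(line):
    """Return the key/value pairs of one structured log line as a dict."""
    return {key: (quoted if quoted else bare)
            for key, quoted, bare in PAIR.findall(line)}


def to_bytes(text):
    """Convert a size such as '8.8 GiB' into a number of bytes."""
    number, unit = text.split()
    return float(number) * UNITS[unit]


def projector_share(line):
    """Fraction of memory.required.full taken by projector weights + graph."""
    fields = parse_fields(line)
    projector = (to_bytes(fields["projector.weights"])
                 + to_bytes(fields["projector.graph"]))
    return projector / to_bytes(fields["memory.required.full"])


# Abbreviated copies of the two offload lines from this comment.
mistral = ('msg=offload library=cuda layers.model=41 layers.offload=33 '
           'memory.required.full="24.5 GiB" projector.weights="769.3 MiB" '
           'projector.graph="8.8 GiB"')
gemma = ('msg=offload library=cuda layers.model=63 layers.offload=63 '
         'memory.required.full="19.7 GiB" projector.weights="795.9 MiB" '
         'projector.graph="1.0 GiB"')

for name, line in (("mistral-small3.1", mistral), ("gemma3:27b", gemma)):
    print(f"{name}: projector share of full requirement = {projector_share(line):.0%}")
```

The logged sizes are rounded to one decimal place, so the shares are approximate, but they are enough to show that the projector estimate dominates the difference between the two models.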


GiteaMirror commented

2026-04-22 13:42:51 -05:00

@rick-github commented on GitHub (Apr 8, 2025):

@lowlyocean Can you add logs from the most recent run?


GiteaMirror commented

2026-04-22 13:42:52 -05:00

@lowlyocean commented on GitHub (Apr 8, 2025):

> @lowlyocean Can you add logs from the most recent run?

2025/04/07 21:17:09 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://i.p.add.ress:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:pathto_model_root\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-04-07T21:17:09.706-04:00 level=INFO source=images.go:458 msg="total blobs: 27"
time=2025-04-07T21:17:09.708-04:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=routes.go:1298 msg="Listening on i.p.add.ress:11434 (version 0.6.5)"
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=sched.go:107 msg="starting llm scheduler"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=4 efficiency=0 threads=8
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[paths]"
time=2025-04-07T21:17:09.710-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvml.dll"
time=2025-04-07T21:17:09.713-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-04-07T21:17:09.732-04:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\Windows\system32\nvml.dll
time=2025-04-07T21:17:09.733-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-04-07T21:17:09.733-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[paths]"
time=2025-04-07T21:17:09.734-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvcuda.dll"
time=2025-04-07T21:17:09.736-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll]"
initializing C:\Windows\system32\nvcuda.dll
dlsym: cuInit - address
dlsym: cuDriverGetVersion - address
dlsym: cuDeviceGetCount - address
dlsym: cuDeviceGet - address
dlsym: cuDeviceGetAttribute - address
dlsym: cuDeviceGetUuid - address
dlsym: cuDeviceGetName - address
dlsym: cuCtxCreate_v3 - address
dlsym: cuMemGetInfo_v2 - address
dlsym: cuCtxDestroy - address
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 2
time=2025-04-07T21:17:09.752-04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=2 library=C:\Windows\system32\nvcuda.dll
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA totalMem 12287 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA freeMem 11247 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] Compute Capability 8.6
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA totalMem 3071 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA freeMem 2462 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] Compute Capability 6.1
time=2025-04-07T21:17:09.918-04:00 level=INFO source=gpu.go:319 msg="detected OS VRAM overhead" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB"
time=2025-04-07T21:17:09.921-04:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
releasing cuda driver library
releasing nvml library
time=2025-04-07T21:17:09.922-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="11.0 GiB"
time=2025-04-07T21:17:09.922-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB"
[GIN] 2025/04/07 - 21:17:10 | 200 |            0s |     i.p.add.ress | HEAD     "/"
[GIN] 2025/04/07 - 21:17:10 | 200 |     50.2812ms |     i.p.add.ress | POST     "/api/show"
time=2025-04-07T21:17:10.166-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.3 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.181-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="11.0 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.197-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.198-04:00 level=DEBUG source=sched.go:183 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=6 gpu_count=2
time=2025-04-07T21:17:10.245-04:00 level=DEBUG source=sched.go:226 msg="loading first model" model=blob_file
time=2025-04-07T21:17:10.245-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T21:17:10.246-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.261-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.277-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T21:17:10.279-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.292-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.308-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T21:17:10.310-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.323-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.339-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T21:17:10.341-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.354-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.370-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.385-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.401-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.403-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.416-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.432-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.447-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.463-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.463-04:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="50.2 GiB" free_swap="42.2 GiB"
time=2025-04-07T21:17:10.463-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.464-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.478-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.494-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.495-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T21:17:10.495-04:00 level=WARN source=server.go:173 msg="flash attention enabled but not supported by gpu"
time=2025-04-07T21:17:10.495-04:00 level=WARN source=server.go:196 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]
time=2025-04-07T21:17:10.551-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T21:17:10.568-04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-04-07T21:17:10.569-04:00 level=INFO source=server.go:405 msg="starting llama server" cmd="C:\\Users\\user\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model pathto_model_root\\blobfile --ctx-size 4096 --batch-size 512 --verbose --threads 4 --no-mmap --parallel 1 --port 61075"
time=2025-04-07T21:17:10.569-04:00 level=DEBUG source=server.go:423 msg=subprocess environment="[env vars]"
time=2025-04-07T21:17:10.572-04:00 level=INFO source=sched.go:451 msg="loaded runners" count=1
time=2025-04-07T21:17:10.572-04:00 level=INFO source=server.go:580 msg="waiting for llama runner to start responding"
time=2025-04-07T21:17:10.573-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server error"
time=2025-04-07T21:17:10.598-04:00 level=INFO source=runner.go:816 msg="starting ollama engine"
time=2025-04-07T21:17:10.600-04:00 level=INFO source=runner.go:879 msg="Server listening on 127.0.0.1:61075"
time=2025-04-07T21:17:10.656-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.name default=""
time=2025-04-07T21:17:10.656-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.description default=""
time=2025-04-07T21:17:10.656-04:00 level=INFO source=ggml.go:67 msg="" architecture=mistral3 file_type=Q2_K name="" description="" num_tensors=585 num_key_values=43
time=2025-04-07T21:17:10.656-04:00 level=DEBUG source=ggml.go:99 msg="ggml backend load all from path" path=C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
  Device 1: NVIDIA GeForce GTX 1060 3GB, compute capability 6.1, VMM: yes
time=2025-04-07T21:17:10.824-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
load_backend: loaded CUDA backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12\ggml-cuda.dll
time=2025-04-07T21:17:10.840-04:00 level=DEBUG source=ggml.go:99 msg="ggml backend load all from path" path=C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 55
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 20
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-04-07T21:17:10.848-04:00 level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 CUDA.1.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.1.USE_GRAPHS=1 CUDA.1.PEER_MAX_BATCH_SIZE=128 compiler=cgo(clang)
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_1.weight shape="[1024 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_2.weight shape="[5120 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.patch_merger.merging_layer.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output.weight shape="[5120 131072]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=token_embd.weight shape="[5120 131072]" dtype=14 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.encoder_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.patch_conv.weight shape="[14 14 3 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.911-04:00 level=INFO source=ggml.go:289 msg="model weights" buffer=CPU size="9.4 GiB"
time=2025-04-07T21:17:11.075-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.11"
time=2025-04-07T21:17:11.326-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.27"
time=2025-04-07T21:17:11.576-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.45"
time=2025-04-07T21:17:11.828-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.61"
time=2025-04-07T21:17:12.079-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.77"
time=2025-04-07T21:17:12.330-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.94"
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CUDA0 buffer_type=CUDA0
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CUDA1 buffer_type=CUDA1
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CPU buffer_type=CUDA_Host
time=2025-04-07T21:17:12.425-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=token_embd.weight type=q6_K shape="[5120 131072]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output.weight type=q2_K shape="[5120 131072]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.patch_conv.weight type=f16 shape="[14 14 3 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.encoder_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_1.weight type=f16 shape="[1024 5120]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_2.weight type=f16 shape="[5120 5120]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.patch_merger.merging_layer.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.580-04:00 level=INFO source=server.go:619 msg="llama runner started in 2.01 seconds"
time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:464 msg="finished setting up runner" model=blob_file
[GIN] 2025/04/07 - 21:17:12 | 200 |    2.4387179s |     i.p.add.ress | POST     "/api/generate"
time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:468 msg="context for request finished"
time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=blob_file duration=5m0s
time=2025-04-07T21:17:12.581-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=blob_file refCount=0
time=2025-04-07T21:17:17.815-04:00 level=DEBUG source=sched.go:577 msg="evaluating already loaded" model=blob_file
time=2025-04-07T21:17:17.854-04:00 level=DEBUG source=routes.go:1522 msg="chat request" images=0 prompt="[SYSTEM_PROMPT]You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. \"yesterday\" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.[/SYSTEM_PROMPT][INST]tell a joke[/INST]"
time=2025-04-07T21:17:17.889-04:00 level=DEBUG source=process_text.go:304 msg="adding bos token to prompt" id=1
time=2025-04-07T21:17:17.891-04:00 level=DEBUG source=cache.go:136 msg="loading cache slot" id=0 cache=0 prompt=362 used=0 remaining=362
[GIN] 2025/04/07 - 21:17:25 | 200 |    7.7953074s |     i.p.add.ress | POST     "/api/chat"
time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:409 msg="context for request finished"
time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=blob_file duration=5m0s
time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=blob_file refCount=0

@lowlyocean commented on GitHub (Apr 8, 2025):

> [@lowlyocean](https://github.com/lowlyocean) Can you add logs from the most recent run?

```
2025/04/07 21:17:09 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://i.p.add.ress:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:pathto_model_root\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-04-07T21:17:09.706-04:00 level=INFO source=images.go:458 msg="total blobs: 27"
time=2025-04-07T21:17:09.708-04:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=routes.go:1298 msg="Listening on i.p.add.ress:11434 (version 0.6.5)"
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=sched.go:107 msg="starting llm scheduler"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=4 efficiency=0 threads=8
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[paths]"
time=2025-04-07T21:17:09.710-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvml.dll"
time=2025-04-07T21:17:09.713-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-04-07T21:17:09.732-04:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\Windows\system32\nvml.dll
time=2025-04-07T21:17:09.733-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-04-07T21:17:09.733-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[paths]"
time=2025-04-07T21:17:09.734-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvcuda.dll"
time=2025-04-07T21:17:09.736-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll]"
initializing C:\Windows\system32\nvcuda.dll
dlsym: cuInit - address
dlsym: cuDriverGetVersion - address
dlsym: cuDeviceGetCount - address
dlsym: cuDeviceGet - address
dlsym: cuDeviceGetAttribute - address
dlsym: cuDeviceGetUuid - address
dlsym: cuDeviceGetName - address
dlsym: cuCtxCreate_v3 - address
dlsym: cuMemGetInfo_v2 - address
dlsym: cuCtxDestroy - address
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 2
time=2025-04-07T21:17:09.752-04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=2 library=C:\Windows\system32\nvcuda.dll
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA totalMem 12287 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA freeMem 11247 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] Compute Capability 8.6
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA totalMem 3071 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA freeMem 2462 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] Compute Capability 6.1
time=2025-04-07T21:17:09.918-04:00 level=INFO source=gpu.go:319 msg="detected OS VRAM overhead" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB"
time=2025-04-07T21:17:09.921-04:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
releasing cuda driver library
releasing nvml library
time=2025-04-07T21:17:09.922-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="11.0 GiB"
time=2025-04-07T21:17:09.922-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB"
[GIN] 2025/04/07 - 21:17:10 | 200 | 0s | i.p.add.ress | HEAD "/"
[GIN] 2025/04/07 - 21:17:10 | 200 | 50.2812ms | i.p.add.ress | POST "/api/show"
time=2025-04-07T21:17:10.166-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.3 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.181-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="11.0 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.197-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.198-04:00 level=DEBUG source=sched.go:183 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=6 gpu_count=2
time=2025-04-07T21:17:10.245-04:00 level=DEBUG source=sched.go:226 msg="loading first model" model=blob_file
time=2025-04-07T21:17:10.245-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T21:17:10.246-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.261-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.277-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T21:17:10.279-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.292-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.308-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T21:17:10.310-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.323-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.339-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T21:17:10.341-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.354-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.370-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.385-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.401-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.403-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.416-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.432-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" 
gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB" time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB" time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB" time=2025-04-07T21:17:10.447-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T21:17:10.463-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T21:17:10.463-04:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="50.2 GiB" free_swap="42.2 GiB" time=2025-04-07T21:17:10.463-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]" time=2025-04-07T21:17:10.464-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB" 
time=2025-04-07T21:17:10.478-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.494-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.495-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T21:17:10.495-04:00 level=WARN source=server.go:173 msg="flash attention enabled but not supported by gpu"
time=2025-04-07T21:17:10.495-04:00 level=WARN source=server.go:196 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]
time=2025-04-07T21:17:10.551-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T21:17:10.568-04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-04-07T21:17:10.569-04:00 level=INFO source=server.go:405 msg="starting llama server" cmd="C:\\Users\\user\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model pathto_model_root\\blobfile --ctx-size 4096 --batch-size 512 --verbose --threads 4 --no-mmap --parallel 1 --port 61075"
time=2025-04-07T21:17:10.569-04:00 level=DEBUG source=server.go:423 msg=subprocess environment="[env vars]"
time=2025-04-07T21:17:10.572-04:00 level=INFO source=sched.go:451 msg="loaded runners" count=1
time=2025-04-07T21:17:10.572-04:00 level=INFO source=server.go:580 msg="waiting for llama runner to start responding"
time=2025-04-07T21:17:10.573-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server error"
time=2025-04-07T21:17:10.598-04:00 level=INFO source=runner.go:816 msg="starting ollama engine"
time=2025-04-07T21:17:10.600-04:00 level=INFO source=runner.go:879 msg="Server listening on 127.0.0.1:61075"
time=2025-04-07T21:17:10.656-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.name default=""
time=2025-04-07T21:17:10.656-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.description default=""
time=2025-04-07T21:17:10.656-04:00 level=INFO source=ggml.go:67 msg="" architecture=mistral3 file_type=Q2_K name="" description="" num_tensors=585 num_key_values=43
time=2025-04-07T21:17:10.656-04:00 level=DEBUG source=ggml.go:99 msg="ggml backend load all from path" path=C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
Device 1: NVIDIA GeForce GTX 1060 3GB, compute capability 6.1, VMM: yes
time=2025-04-07T21:17:10.824-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
load_backend: loaded CUDA backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12\ggml-cuda.dll
time=2025-04-07T21:17:10.840-04:00 level=DEBUG source=ggml.go:99 msg="ggml backend load all from path" path=C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 55
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 20
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-04-07T21:17:10.848-04:00 level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 CUDA.1.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.1.USE_GRAPHS=1 CUDA.1.PEER_MAX_BATCH_SIZE=128 compiler=cgo(clang)
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_1.weight shape="[1024 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_2.weight shape="[5120 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.patch_merger.merging_layer.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output.weight shape="[5120 131072]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=token_embd.weight
shape="[5120 131072]" dtype=14 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG 
source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_gate.weight shape="[1024 4096]" 
dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=v.blk.12.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_v.weight shape="[1024 1024]" dtype=1 
buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=v.blk.14.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_output.weight shape="[1024 1024]" dtype=1 
buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=v.blk.17.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.2.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU 
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.encoder_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.patch_conv.weight shape="[14 14 3 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.911-04:00 level=INFO source=ggml.go:289 
msg="model weights" buffer=CPU size="9.4 GiB"
time=2025-04-07T21:17:11.075-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.11"
time=2025-04-07T21:17:11.326-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.27"
time=2025-04-07T21:17:11.576-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.45"
time=2025-04-07T21:17:11.828-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.61"
time=2025-04-07T21:17:12.079-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.77"
time=2025-04-07T21:17:12.330-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.94"
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CUDA0 buffer_type=CUDA0
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CUDA1 buffer_type=CUDA1
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CPU buffer_type=CUDA_Host
time=2025-04-07T21:17:12.425-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T21:17:12.428-04:00 level=DEBUG
source=model.go:205 msg="found tensor" name=token_embd.weight type=q6_K shape="[5120 131072]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=blk.1.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=blk.3.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_norm.weight 
type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_v.weight type=f16 shape="[5120 1024]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_gate.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_output.weight type=q3_K shape="[4096 5120]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_v.weight type=f16 shape="[5120 1024]" 
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_gate.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_output.weight type=q3_K shape="[4096 5120]" 
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output.weight type=q2_K shape="[5120 131072]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.patch_conv.weight type=f16 shape="[14 14 3 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.encoder_norm.weight type=f32 shape=[1024] 
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_norm.weight type=f32 shape=[1024] 
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_norm.weight type=f32 shape=[1024] 
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_q.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_gate.weight type=f16 shape="[1024 4096]" 
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_k.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_up.weight type=f16 shape="[1024 4096]" 
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_v.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_down.weight type=f16 shape="[4096 
1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_output.weight type=f16 
shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_norm.weight 
type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=v.blk.18.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=v.blk.20.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=v.blk.21.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 
msg="found tensor" name=v.blk.23.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_1.weight type=f16 shape="[1024 5120]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_2.weight type=f16 shape="[5120 5120]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.patch_merger.merging_layer.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.580-04:00 level=INFO source=server.go:619 msg="llama runner started in 2.01 seconds" time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:464 msg="finished setting up runner" model=blob_file [GIN] 2025/04/07 - 21:17:12 | 200 | 2.4387179s | i.p.add.ress | POST "/api/generate" time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:468 msg="context for request finished" time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:341 msg="runner with 
non-zero duration has gone idle, adding timer" modelPath=blob_file duration=5m0s time=2025-04-07T21:17:12.581-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=blob_file refCount=0 time=2025-04-07T21:17:17.815-04:00 level=DEBUG source=sched.go:577 msg="evaluating already loaded" model=blob_file time=2025-04-07T21:17:17.854-04:00 level=DEBUG source=routes.go:1522 msg="chat request" images=0 prompt="[SYSTEM_PROMPT]You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. \"yesterday\" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. 
You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.[/SYSTEM_PROMPT][INST]tell a joke[/INST]" time=2025-04-07T21:17:17.889-04:00 level=DEBUG source=process_text.go:304 msg="adding bos token to prompt" id=1 time=2025-04-07T21:17:17.891-04:00 level=DEBUG source=cache.go:136 msg="loading cache slot" id=0 cache=0 prompt=362 used=0 remaining=362 [GIN] 2025/04/07 - 21:17:25 | 200 | 7.7953074s | i.p.add.ress | POST "/api/chat" time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:409 msg="context for request finished" time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=blob_file duration=5m0s time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=blob_file refCount=0 ```

GiteaMirror commented

2026-04-22 13:42:54 -05:00

@rick-github commented on GitHub (Apr 8, 2025):

time=2025-04-07T21:17:10.495-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1
 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B"
 memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB"
 memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB"
 memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB"
 projector.weights="769.3 MiB" projector.graph="8.8 GiB"

time=2025-04-07T21:17:10.569-04:00 level=INFO source=server.go:405 msg="starting llama server"
 cmd="C:\\Users\\user\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner
 --ollama-engine
 --model pathto_model_root\\blobfile
 --ctx-size 4096
 --batch-size 512
 --verbose --threads 4 --no-mmap --parallel 1 --port 61075"

load_backend: loaded CUDA backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12\ggml-cuda.dll

It found the GPU backend, but num_gpu was unset, so it offloaded the number of layers it calculated would fit, i.e. zero.
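
The figures in the offload line above suggest why the estimate comes out at zero layers. A rough sketch of the arithmetic follows. It assumes that the projector weights and the projector graph must both be reserved on a single GPU before any layer is placed; that is an assumption about the estimator, not a reading of its code. The values are the rounded ones printed in the log, and the helper name is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Rounded values copied from the msg=offload log line.
available = [9.7 * GIB, 2.4 * GIB]   # memory.available
projector_weights = 769.3 * MIB      # projector.weights
projector_graph = 8.8 * GIB          # projector.graph
graph_full = 213.3 * MIB             # memory.graph.full


def headroom_after_projector(free_bytes: float) -> float:
    """Bytes left on one GPU after reserving the projector (assumed model)."""
    return free_bytes - (projector_weights + projector_graph)


for index, free in enumerate(available):
    left = headroom_after_projector(free)
    fits_graph = left >= graph_full
    print(f"GPU {index}: {left / MIB:.1f} MiB left, room for compute graph: {fits_graph}")
```

Under that assumption the 12 GB card is left with roughly 150 MiB, less than the 213.3 MiB compute graph, and the 3 GB card goes negative. Both results would be consistent with layers.offload=0. The 8.8 GiB projector.graph figure dominates the total.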

GiteaMirror commented

2026-04-22 13:42:56 -05:00

@lowlyocean commented on GitHub (Apr 8, 2025):

With the default context window of 2048, I was able to force num_gpu to 41 (all the layers) and it seems to load onto the GPU (despite ollama ps showing 100% CPU).

However, increasing context window to 8192 causes a crash:

ggml_backend_cuda_buffer_type_alloc_buffer: allocating 248.01 MiB on device 1: cudaMalloc failed: out of memory
ggml_gallocr_reserve_n: failed to allocate CUDA1 buffer of size 260055040

Is someone already tackling the underlying issue that makes it necessary to set num_gpu and manipulate PATH manually?
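
For reference, num_gpu can also be passed per request through the options field of Ollama's /api/generate endpoint instead of being set in a Modelfile. A minimal sketch, assuming a server on the default localhost:11434 and the model tag created earlier in this issue. The helper name build_generate_request is made up for illustration, and nothing is sent until the request is handed to urlopen.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default address


def build_generate_request(model: str, prompt: str, num_gpu: int, num_ctx: int) -> urllib.request.Request:
    """Build (but do not send) a generate request that overrides layer offload."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "num_gpu": num_gpu,   # number of layers to offload to the GPU(s)
            "num_ctx": num_ctx,   # context window size in tokens
        },
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request(
    model="mistral-small3.1:24b-instruct-2503-q2_k",
    prompt="tell a joke",
    num_gpu=41,
    num_ctx=2048,
)
# To run it against a live server:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read())["response"])
```

Forcing all 41 layers bypasses the memory estimate, so a larger num_ctx can still run out of VRAM, as in the cudaMalloc failure above.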

GiteaMirror commented

2026-04-22 13:42:56 -05:00

@arturo-air commented on GitHub (Apr 8, 2025):

Something similar is happening to me. My Ollama version is 0.6.5, and I just pulled the Mistral model with ollama run mistral-small3.1 (hash b9aaf0c2586a).

I am using a 4090, and when I run the new mistral model, it does not allocate 100% on the GPU:

arturo@thinkpad:~$ nvidia-smi 
Tue Apr  8 05:45:51 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:01:00.0 Off |                  Off |
|  0%   40C    P0              69W / 450W |    112MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2020      G   /usr/lib/xorg/Xorg                           86MiB |
|    0   N/A  N/A      2361      G   /usr/bin/gnome-shell                         13MiB |
+---------------------------------------------------------------------------------------+
arturo@thinkpad:~$ ollama run mistral-small3.1:latest
>>> hello
Hello! How can I assist you today?

>>> 
arturo@thinkpad:~$ ollama ps
NAME                       ID              SIZE     PROCESSOR         UNTIL              
mistral-small3.1:latest    b9aaf0c2586a    26 GB    6%/94% CPU/GPU    4 minutes from now    
arturo@thinkpad:~$ nvidia-smi 
Tue Apr  8 05:46:43 2025       
.........
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2020      G   /usr/lib/xorg/Xorg                           86MiB |
|    0   N/A  N/A      2361      G   /usr/bin/gnome-shell                         13MiB |
|    0   N/A  N/A   1229081      C   /usr/local/bin/ollama                     13298MiB |
+---------------------------------------------------------------------------------------+

On the other hand, if I do the same with another model, like gemma3:27b, the problem doesn't occur:

arturo@thinkpad:~$ ollama run gemma3:27b
>>> hello
Hello there! 👋 

How can I help you today? Just let me know what you're thinking, or if you just wanted to say hi, that's great too! 
.....
>>> 
arturo@thinkpad:~$ ollama ps
NAME          ID              SIZE     PROCESSOR    UNTIL              
gemma3:27b    30ddded7fba6    22 GB    100% GPU     4 minutes from now

So my guess is that ollama is wrongly estimating the size of the new mistral model, since it is also smaller than the gemma one:

arturo@thinkpad:~$ ollama ls
NAME                          ID              SIZE      MODIFIED       
mistral-small3.1:latest       b9aaf0c2586a    15 GB     35 minutes ago    
gemma3:27b                    30ddded7fba6    17 GB     3 weeks ago
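
For anyone who wants to check the CPU/GPU split from a script instead of reading it by eye, here is a minimal sketch that parses `ollama ps` text like the output pasted above. The helper names `gpu_percent` and `parse_ollama_ps` are made up for illustration, and the parser assumes the column layout shown in this comment (columns separated by two or more spaces), which may differ in other Ollama versions.

```python
import re


def gpu_percent(processor: str) -> int:
    """Return the GPU share from a PROCESSOR cell such as
    '6%/94% CPU/GPU' or '100% GPU'. Returns 0 if no GPU label is present."""
    percents, labels = processor.split()
    for pct, label in zip(percents.split("/"), labels.split("/")):
        if label == "GPU":
            return int(pct.rstrip("%"))
    return 0


def parse_ollama_ps(text: str) -> dict:
    """Map model name -> GPU percentage for each row of `ollama ps` output.
    Assumes a header row followed by rows whose columns are separated
    by at least two spaces (the layout seen in this thread)."""
    rows = [line for line in text.strip().splitlines() if line.strip()]
    result = {}
    for row in rows[1:]:  # skip the header row
        name, _id, _size, processor, _until = re.split(r"\s{2,}", row.strip())
        result[name] = gpu_percent(processor)
    return result


# Sample copied from the output above.
SAMPLE = """\
NAME                       ID              SIZE     PROCESSOR         UNTIL
mistral-small3.1:latest    b9aaf0c2586a    26 GB    6%/94% CPU/GPU    4 minutes from now
"""

print(parse_ollama_ps(SAMPLE))
```

In practice the text would come from running `ollama ps` and capturing its standard output; the sample string is used here only so the sketch runs without a server.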

GiteaMirror commented

2026-04-22 13:42:57 -05:00

@metal3d commented on GitHub (Apr 8, 2025):

I've got the same problem on one server with Deepseek-R1: the model is not loaded onto the 4 cards with 8 GB of VRAM each (so 32 GB available), while it works on my computer (same distribution, Fedora 41) with a single card that has 24 GB of VRAM.

GiteaMirror commented

2026-04-22 13:43:00 -05:00

@vini-muchulski commented on GitHub (Apr 8, 2025):

try this! https://www.reddit.com/r/LocalLLaMA/comments/1judvfg/how_to_fix_slow_inference_speed_of_mistralsmall/

GiteaMirror commented

2026-04-22 13:43:00 -05:00

@rick-github commented on GitHub (Apr 8, 2025):

> Is there already someone tackling the underlying issue causing the need to have to set num_gpu and manipulate PATH manually?

The underlying issue is that you don't have enough VRAM to run the model. Setting num_gpu and PATH tries to skirt the issue, but then you end up with OOMs.

> So my guess is that ollama is wrongly estimating the size of the new mistral model, since it is also smaller than the gemma one:

They are two different models with different architectures, context windows, block counts, etc. The memory estimates will not be the same.

> the model is not loaded onto the 4 cards with 8 GB of VRAM each (so 32 GB available)

When a model is split across multiple devices, the amount of overhead goes up. A model loaded into a GPU consists of weights, a context buffer, a computation graph, projector data structures, etc. Some of those allocations need to be replicated on every device, so the extra copies increase the overall VRAM requirement.
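
To make the arithmetic concrete, here is a rough sketch using the numbers from the debug log in the original report (available, layer_size, partial_offload, minimum_memory, and the field logged as gpu_zer_overhead). This is not Ollama's actual estimator: the function `layers_that_fit` and the assumption that these terms are simply subtracted from the available memory are mine, for illustration only.

```python
GIB = 1024**3
MIB = 1024**2


def layers_that_fit(available, layer_size, fixed_overhead, graph, minimum_memory):
    """Simplified model: whatever is left after the fixed costs is
    divided into whole layers. All arguments are in bytes."""
    usable = available - fixed_overhead - graph - minimum_memory
    if usable < layer_size:
        return 0
    return int(usable // layer_size)


# Values taken from the RTX 3060 line of the log in the original report.
with_overhead = layers_that_fit(
    available=9.9 * GIB,
    layer_size=232.6 * MIB,
    fixed_overhead=9.5 * GIB,   # logged as gpu_zer_overhead
    graph=853.3 * MIB,          # logged as partial_offload
    minimum_memory=479199232,   # logged as minimum_memory (bytes)
)

# Same card, hypothetically without the 9.5 GiB overhead term.
without_overhead = layers_that_fit(
    available=9.9 * GIB,
    layer_size=232.6 * MIB,
    fixed_overhead=0,
    graph=853.3 * MIB,
    minimum_memory=479199232,
)

print("layers with overhead:   ", with_overhead)
print("layers without overhead:", without_overhead)
```

Under these assumptions the 9.5 GiB overhead alone consumes almost all of the 9.9 GiB reported as available, which is consistent with the "gpu has too little memory to allocate any layers" message even though the layers themselves are small.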

GiteaMirror commented

2026-04-22 13:43:00 -05:00

@rick-github commented on GitHub (Apr 8, 2025):

See https://github.com/ollama/ollama/issues/8597#issuecomment-2614533288 for ways to deal with OOMs.

GiteaMirror commented

2026-04-22 13:43:01 -05:00

@lowlyocean commented on GitHub (Apr 8, 2025):

> > Is there already someone tackling the underlying issue causing the need to have to set num_gpu and manipulate PATH manually?
>
> The underlying issue is that you don't have enough VRAM to run the model. Setting num_gpu and PATH tries to skirt the issue, but then you end up with OOMs.

I meant specifically the issue that causes it to think that 0 layers can fit on the GPU even though the GPU is bigger than the entire model.

GiteaMirror commented

2026-04-22 13:43:02 -05:00

@rick-github commented on GitHub (Apr 8, 2025):

The memory estimation logic is receiving attention and there's always room for improvement, but fundamentally you don't have enough VRAM to host the model with the default parameters. There are ways to tweak those parameters, as I linked above, but as I mentioned about splitting the model across multiple devices, the extra overhead involved with your setup is just going to make it hard to host mistral-small.

GiteaMirror commented

2026-04-22 13:43:02 -05:00

@lowlyocean commented on GitHub (Apr 8, 2025):

> The memory estimation logic is receiving attention and there's always room for improvement, but fundamentally you don't have enough VRAM to host the model with the default parameters. There are ways to tweak those parameters, as I linked above, but as I mentioned about splitting the model across multiple devices, the extra overhead involved with your setup is just going to make it hard to host mistral-small.

Thanks. I want to point out that when I ran the Bartowski Q2_K quant of Mistral Small 3.1 from HF

ollama pull hf.co/bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF:Q2_K

it worked with a context of 8192, flash attention on, and KV cache q8_0, and ollama ps showed that it fit entirely into the GPU automatically, without needing to add libraries to PATH or set num_gpu manually.

To be clear: there seems to be something specific about having quantized the official Ollama version that's causing this edge case.

GiteaMirror commented

2026-04-22 13:43:03 -05:00

@rick-github commented on GitHub (Apr 8, 2025):

The bartowski model doesn't do vision (https://huggingface.co/bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF/discussions/1). The projector graph for vision support is large (https://github.com/ollama/ollama/issues/10167#issuecomment-2784917541).

GiteaMirror commented

2026-04-22 13:43:03 -05:00

@lowlyocean commented on GitHub (Apr 8, 2025):

Thanks, understood. For now, if I set num_gpu manually to the maximum (41), I seem to get tokens/s nearly on par with the Bartowski model, with no OOM failures. Future visitors to this issue may want to consider that as a workaround.
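
For future visitors, one way to apply the num_gpu workaround per request is through the `options` field of Ollama's `/api/generate` endpoint. The sketch below only builds the request and prints it; the helper name `build_generate_request` is hypothetical, the host is assumed to be the default local address, and the model tag is the one created in the original report. As noted earlier in the thread, forcing layers onto the GPU this way can end in OOMs, so treat it as an experiment.

```python
import json
import urllib.request


def build_generate_request(model, prompt, num_gpu, host="http://localhost:11434"):
    """Build (but do not send) a POST request for /api/generate that
    overrides the number of layers offloaded to the GPU."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": num_gpu},  # layers to offload, 41 in this thread
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request(
    "mistral-small3.1:24b-instruct-2503-q2_k", "hello", num_gpu=41
)
print(req.full_url)
print(req.data.decode("utf-8"))
# To actually send it against a running server:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["response"])
```

The same setting can be made persistent by adding `PARAMETER num_gpu 41` to the Modelfile before running `ollama create`.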

GiteaMirror commented

2026-04-22 13:43:04 -05:00

@orrinwitt commented on GitHub (Apr 9, 2025):

I just wanted to chime in and say I've got the same issue: 3 NVIDIA GPUs with a combined ~28 GB of VRAM (12+8+8), and nvidia-smi shows literally none of the model being loaded onto any of them. Eventually I get output, since I have enough system RAM, but at an obviously unusable speed.

GiteaMirror commented

2026-04-22 13:43:04 -05:00

@metal3d commented on GitHub (Apr 9, 2025):

I had the same... And finally found that reducing the context window size to 8096 made everything OK on multiple GPUs.
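
As a rough illustration of why a smaller context window helps: the KV cache grows linearly with the context length. The sketch below uses the standard size formula for a transformer KV cache; the model dimensions are hypothetical placeholder values, not read from any model in this thread, and the real allocation in Ollama may differ (cache quantization, parallel slots, and so on).

```python
def kv_cache_bytes(num_ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    """Approximate KV cache size: one key and one value vector
    (hence the factor 2) per layer, per KV head, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * num_ctx * bytes_per_elem


# Hypothetical dimensions, chosen only for illustration (f16 cache = 2 bytes).
ASSUMED = dict(n_layers=40, n_kv_heads=8, head_dim=128, bytes_per_elem=2)

for num_ctx in (8192, 32768, 131072):
    gib = kv_cache_bytes(num_ctx, **ASSUMED) / 1024**3
    print(f"num_ctx={num_ctx:>6}: ~{gib:.2f} GiB of KV cache")
```

With these placeholder numbers, going from 8192 to 131072 tokens multiplies the cache by 16, which is the kind of difference that can decide whether a model fits on small cards.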


GiteaMirror commented

2026-04-22 13:43:06 -05:00

@orrinwitt commented on GitHub (Apr 18, 2025):

This behavior remains unchanged in 0.6.6-rc2

GiteaMirror commented

2026-04-22 13:43:07 -05:00

@thot-experiment commented on GitHub (Apr 25, 2025):

I'm also hitting this issue. With a 32 GB GPU I cannot load mistral 3.1 q6k fully into VRAM; if I let ollama see my second GPU, the model gets split, taking a total of 24 GB. Quantizing the same model via ollama down to q5ks lets me fit it on one GPU, but ollama ps reports 31 GB of VRAM used while nvidia-smi and Task Manager both agree that only 22 GB of VRAM is used (including by other applications). RAM usage was measured during vision inference, and the model outputs confirm that the vision tower is working correctly.

This is a breaking bug: it prevents people from loading models that would fit into VRAM for no reason other than bad usage estimation. I'm using num_gpu 41, btw.

GiteaMirror commented

2026-04-22 13:43:08 -05:00

@fbroussais commented on GitHub (May 11, 2025):

I have the same issue with an RTX 4060 Ti 16 GB: ~3 GB in VRAM and ~11 GB in CPU RAM with mistral-small3.1:latest (14 GB).
ollama 0.6.8 on Win11.
Other models of similar size fit more into VRAM (90% for the 12 GB mistral-nemo:12b-instruct-2407-q8_0).
