[GH-ISSUE #10167] Quantized Mistral small 3.1 doesn't utilize NVIDIA GPUs #32432

Closed
opened 2026-04-22 13:42:34 -05:00 by GiteaMirror · 36 comments
Owner

Originally created by @lowlyocean on GitHub (Apr 7, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10167

What is the issue?

  1. Pulled the new f16 Mistral Small 3.1 (https://ollama.com/library/mistral-small3.1:24b-instruct-2503-fp16)
  2. Created a new Modelfile containing only the line FROM mistral-small3.1:24b-instruct-2503-fp16
  3. Ran the following command to create a Q2_K quant:
    ollama create -q q2_k mistral-small3.1:24b-instruct-2503-q2_k
  4. Ran the model, then checked ollama ps
  5. Noticed that it's 100% loaded into CPU (instead of using the 12GB + 3GB of VRAM from two NVIDIA GPUs)

The log keeps saying there is not enough VRAM to allocate any layers, but the entire quantized model is only 10GB.

Relevant log output

time=2025-04-07T15:19:33.122-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.9 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T15:19:33.122-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T15:19:33.122-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T15:19:33.122-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.9 GiB 2.4 GiB]"
time=2025-04-07T15:19:33.123-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="51.1 GiB" before.free_swap="44.0 GiB" now.total="63.9 GiB" now.free="51.1 GiB" now.free_swap="44.0 GiB"
time=2025-04-07T15:19:33.136-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.9 GiB" now.total="12.0 GiB" now.free="9.9 GiB" now.used="2.1 GiB"
time=2025-04-07T15:19:33.152-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T15:19:33.153-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.9 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T15:19:33.153-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T15:19:33.153-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T15:19:33.153-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="51.1 GiB" before.free_swap="44.0 GiB" now.total="63.9 GiB" now.free="51.1 GiB" now.free_swap="44.0 GiB"
time=2025-04-07T15:19:33.167-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.9 GiB" now.total="12.0 GiB" now.free="9.9 GiB" now.used="2.1 GiB"
time=2025-04-07T15:19:33.183-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
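
For reference, the numbers in the RTX 3060 line above can be checked by hand. The sketch below assumes that a GPU is rejected when its free memory is smaller than the sum of the logged overhead, the larger graph size, the per-GPU minimum, and two layers. That formula is an assumption inferred from the logged field names, not something confirmed in this thread, and `required_for_first_layer` is a hypothetical helper.

```python
# Values copied from the RTX 3060 "gpu has too little memory" log line.
MIB = 1024 ** 2
GIB = 1024 ** 3

available = 9.9 * GIB
gpu_zero_overhead = 9.5 * GIB   # logged as gpu_zer_overhead
minimum_memory = 479199232      # bytes, exactly as logged (457 MiB)
layer_size = 232.6 * MIB
partial_offload = 853.3 * MIB
full_offload = 853.3 * MIB


def required_for_first_layer(overhead, graph, minimum, layer):
    """Assumed threshold: overhead + graph + per-GPU minimum + two layers."""
    return overhead + graph + minimum + 2 * layer


needed = required_for_first_layer(
    gpu_zero_overhead,
    max(partial_offload, full_offload),
    minimum_memory,
    layer_size,
)
fits = available >= needed

print(f"needed    {needed / GIB:.2f} GiB")
print(f"available {available / GIB:.2f} GiB")
print("fits" if fits else "does not fit")
```

Under that assumption the 9.5 GiB overhead plus the minimum alone already exceeds the 9.9 GiB available, so the size of the quantized weights never enters the decision.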

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.6.5

GiteaMirror added the bug label 2026-04-22 13:42:35 -05:00

@btibor91 commented on GitHub (Apr 7, 2025):

I am experiencing the same problem with mistral-small3.1:24b (24b-instruct-2503-q4_K_M) - size is 15 GB, and with 20 GB of VRAM, it gets split into 8 GB and 7 GB between VRAM and RAM.

ollama 0.6.5 / Ubuntu 22.04

ollama ps shows size 26 GB, but ollama ls only 15 GB

NVIDIA RTX 4000 SFF Ada (20GB VRAM)

source=server.go:105 msg="system memory" total="62.6 GiB" free="48.9 GiB" free_swap="26.3 GiB"

source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=24 layers.split="" memory.available="[18.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="24.7 GiB" memory.required.partial="18.7 GiB" memory.required.kv="640.0 MiB" memory.required.allocations="[18.7 GiB]" memory.weights.total="13.1 GiB" memory.weights.repeating="12.7 GiB" memory.weights.nonrepeating="360.0 MiB" memory.graph.full="426.7 MiB" memory.graph.partial="426.7 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"

source=ggml.go:289 msg="model weights" buffer=CPU size="6.8 GiB"

source=ggml.go:289 msg="model weights" buffer=CUDA0 size="7.6 GiB"

@rick-github commented on GitHub (Apr 7, 2025):

time=2025-04-07T19:58:31.332Z level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1
 layers.model=41 layers.offload=14 layers.split="" memory.available="[15.6 GiB]" memory.gpu_overhead="0 B"
 memory.required.full="25.0 GiB" memory.required.partial="15.4 GiB" memory.required.kv="640.0 MiB"
 memory.required.allocations="[15.4 GiB]" memory.weights.total="13.1 GiB" memory.weights.repeating="12.7 GiB"
 memory.weights.nonrepeating="360.0 MiB" memory.graph.full="426.7 MiB" memory.graph.partial="426.7 MiB"
 projector.weights="769.3 MiB" projector.graph="8.8 GiB"

It looks like ollama is wildly over-estimating the VRAM required. nvidia-smi shows the backend only allocated 5.3G where ollama estimated 15.4G.
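
To see where the estimate goes, the key=value fields of the offload line can be pulled apart and compared. This is a minimal sketch using a shortened copy of the line above; `parse_logfmt` and `to_bytes` are hypothetical helpers, and treating the 5.3G reported by nvidia-smi as GiB is an assumption.

```python
import shlex

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}

# Shortened copy of the offload log line quoted above.
LINE = (
    'msg=offload library=cuda layers.model=41 layers.offload=14 '
    'layers.split="" memory.available="[15.6 GiB]" '
    'memory.required.full="25.0 GiB" memory.required.partial="15.4 GiB" '
    'memory.weights.total="13.1 GiB" projector.weights="769.3 MiB" '
    'projector.graph="8.8 GiB"'
)


def parse_logfmt(line):
    """Split a key=value log line into a dict, honoring quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def to_bytes(text):
    """Convert a single size such as '8.8 GiB' or '[15.6 GiB]' to bytes."""
    number, unit = text.strip("[]").split()
    return float(number) * UNITS[unit]


fields = parse_logfmt(LINE)
projector = to_bytes(fields["projector.weights"]) + to_bytes(fields["projector.graph"])
full = to_bytes(fields["memory.required.full"])
projector_share = projector / full

estimated = to_bytes(fields["memory.required.partial"])
actual = 5.3 * UNITS["GiB"]  # from nvidia-smi; unit assumed to be GiB
overestimate = estimated / actual

print(f"projector share of full estimate: {projector_share:.0%}")
print(f"estimate / actual allocation: {overestimate:.1f}x")
```

`to_bytes` only handles a single size per field, so a multi-GPU value such as "[9.9 GiB 2.4 GiB]" would need to be split first.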


@btibor91 commented on GitHub (Apr 7, 2025):

Possibly related to #10128


@rick-github commented on GitHub (Apr 7, 2025):

No flash attention, so not directly related to #10128. But ollama has always had issues with correct estimations; it's just gotten worse with the new Go-based runner. gemma3 has the same problem (#9791, #10040).


@jessegross commented on GitHub (Apr 7, 2025):

https://github.com/ollama/ollama/issues/9791#issuecomment-2755958292


@lowlyocean commented on GitHub (Apr 7, 2025):

Is this a regression worthy of a rollback until the issue with the new runner gets sorted out?

It seems quite severe for a 10GB model to have no layers at all allocated to 15GB available VRAM


@maxi1134 commented on GitHub (Apr 7, 2025):

Does anyone know how I can force the offload?

I tried setting num_gpu to a wildly large number (170) for mistral small 3.1 in an attempt to get it to run, but it only offloads 16 GB out of the 24 GB available on my 3090.


@rick-github commented on GitHub (Apr 7, 2025):

16G is the full model, it's fully offloaded.


@lowlyocean commented on GitHub (Apr 7, 2025):

I just tried setting PARAMETER num_gpu 32 by adding to the Modelfile dumped from the Q2_K quant and regenerating the model from it. Still seeing 100% CPU use
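
The same num_gpu setting can also be passed per request through the API's options field, which avoids regenerating the model for each attempt. This is a minimal sketch, assuming a server on the default http://localhost:11434 (the real host is redacted in this thread); `build_generate_request` is a hypothetical helper and the prompt is a placeholder.

```python
import json
import urllib.request


def build_generate_request(model, prompt, num_gpu, host="http://localhost:11434"):
    """Build a POST to /api/generate asking for num_gpu offloaded layers."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": num_gpu},
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request("mistral-small3.1:24b-instruct-2503-q2_k", "Hello", 32)

# Uncomment to send the request to a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["response"])
```

Whether the option takes effect depends on the server still considering a GPU backend for the model, so checking ollama ps after the request remains necessary.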


@rick-github commented on GitHub (Apr 7, 2025):

I suspect you are hitting a corner case. Normally, ollama will compute that at least one layer will fit, and will list the GPU backends in the list of backends to consider when the runner loads the model. This is where you can override it by setting num_gpu: the runner loads a GPU backend and then num_gpu kicks in. In your case, ollama has decided that no way will it be able to load a layer into the GPU, so the GPU backends are not included in the list of backends to choose from. You can hack around this by adding the path to the GPU library directory to the PATH environment variable in the server.


@rick-github commented on GitHub (Apr 7, 2025):

Although, having said that, I see that you have 9.9G available on one of your cards. I think that ollama should be able to load at least one layer there, so maybe my guess above is incorrect. If you can supply server logs it may be easier to diagnose.


@lowlyocean commented on GitHub (Apr 7, 2025):

> no way will it be able to load a layer into the GPU

Any part of the log that can confirm if this is happening? Because I have other (larger) models than this 10GB quant which get loaded fully onto the GPUs. Even with this latest release (0.6.5) of ollama. So that also rules out failing to find the GPU libraries.

Could the quantization to Q2_K somehow be making the model a single massive layer?


@maxi1134 commented on GitHub (Apr 7, 2025):

> 16G is the full model, it's fully offloaded.

Odd, it still shows some CPU usage in ollama ps

Image: https://github.com/user-attachments/assets/54c28bdc-6baf-4749-b24b-58c38e9a72f1


@lowlyocean commented on GitHub (Apr 7, 2025):

> Although, having said that, I see that you have 9.9G available on one of your cards. I think that ollama should be able to load at least one layer there, so maybe my guess above is incorrect. If you can supply server logs it may be easier to diagnose.

2025/04/07 19:19:06 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://i.p.add.ress:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:modelDir OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-04-07T19:19:06.150-04:00 level=INFO source=images.go:458 msg="total blobs: 28"
time=2025-04-07T19:19:06.152-04:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=routes.go:1298 msg="Listening on i.p.add.ress:11434 (version 0.6.5)"
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=sched.go:107 msg="starting llm scheduler"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=4 efficiency=0 threads=8
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-04-07T19:19:06.154-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="PATHs"
time=2025-04-07T19:19:06.155-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvml.dll"
time=2025-04-07T19:19:06.156-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-04-07T19:19:06.175-04:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\Windows\system32\nvml.dll
time=2025-04-07T19:19:06.177-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-04-07T19:19:06.177-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="PATHs"
initializing C:\Windows\system32\nvcuda.dll
dlsym: cuInit - address
dlsym: cuDriverGetVersion - address
dlsym: cuDeviceGetCount - address
dlsym: cuDeviceGet - address
dlsym: cuDeviceGetAttribute - address
dlsym: cuDeviceGetUuid - address
dlsym: cuDeviceGetName - address
dlsym: cuCtxCreate_v3 - address
dlsym: cuMemGetInfo_v2 - address
dlsym: cuCtxDestroy - address
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 2
time=2025-04-07T19:19:06.198-04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=2 library=C:\Windows\system32\nvcuda.dll
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA totalMem 12287 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA freeMem 11247 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] Compute Capability 8.6
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA totalMem 3071 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA freeMem 2462 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] Compute Capability 6.1
time=2025-04-07T19:19:06.376-04:00 level=INFO source=gpu.go:319 msg="detected OS VRAM overhead" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB"
time=2025-04-07T19:19:06.379-04:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
releasing cuda driver library
releasing nvml library
time=2025-04-07T19:19:06.380-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="11.0 GiB"
time=2025-04-07T19:19:06.380-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB"
[GIN] 2025/04/07 - 19:19:15 | 200 |            0s |     i.p.add.ress | HEAD     "/"
[GIN] 2025/04/07 - 19:19:15 | 200 |     51.6352ms |     i.p.add.ress | POST     "/api/show"
time=2025-04-07T19:19:15.940-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="48.0 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:15.951-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="11.0 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:15.967-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:15.968-04:00 level=DEBUG source=sched.go:183 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=6 gpu_count=2
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=sched.go:226 msg="loading first model" model=model_blob_file
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.027-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.042-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T19:19:16.045-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.058-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.074-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.089-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.104-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T19:19:16.106-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.119-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.135-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.135-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.150-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.165-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.166-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.166-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.181-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.212-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.228-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.229-04:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="47.9 GiB" free_swap="42.4 GiB"
time=2025-04-07T19:19:16.229-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.229-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.243-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.259-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.260-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=32 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T19:19:16.260-04:00 level=WARN source=server.go:173 msg="flash attention enabled but not supported by gpu"
time=2025-04-07T19:19:16.260-04:00 level=WARN source=server.go:196 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]
time=2025-04-07T19:19:16.314-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T19:19:16.324-04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-04-07T19:19:16.324-04:00 level=INFO source=server.go:405 msg="starting llama server" cmd="%LocalAppData%\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model modelDir\\model_blob_file --ctx-size 4096 --batch-size 512 --n-gpu-layers 32 --verbose --threads 4 --no-mmap --parallel 1 --port 54638"
time=2025-04-07T19:19:16.324-04:00 level=DEBUG source=server.go:423 msg=subprocess environment="env variables"
time=2025-04-07T19:19:16.329-04:00 level=INFO source=sched.go:451 msg="loaded runners" count=1
time=2025-04-07T19:19:16.329-04:00 level=INFO source=server.go:580 msg="waiting for llama runner to start responding"
time=2025-04-07T19:19:16.329-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server error"
time=2025-04-07T19:19:16.354-04:00 level=INFO source=runner.go:816 msg="starting ollama engine"
time=2025-04-07T19:19:16.355-04:00 level=INFO source=runner.go:879 msg="Server listening on 127.0.0.1:54638"
time=2025-04-07T19:19:16.408-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.name default=""
time=2025-04-07T19:19:16.408-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.description default=""
time=2025-04-07T19:19:16.408-04:00 level=INFO source=ggml.go:67 msg="" architecture=mistral3 file_type=Q2_K name="" description="" num_tensors=585 num_key_values=43
time=2025-04-07T19:19:16.408-04:00 level=DEBUG source=ggml.go:93 msg="skipping path which is not part of ollama" path=C:\Python311\Scripts
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 55
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 20
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-04-07T19:19:16.437-04:00 level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(clang)
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_1.weight shape="[1024 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_2.weight shape="[5120 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.patch_merger.merging_layer.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output.weight shape="[5120 131072]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=token_embd.weight shape="[5120 131072]" dtype=14 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.encoder_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.patch_conv.weight shape="[14 14 3 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=INFO source=ggml.go:289 msg="model weights" buffer=CPU size="9.4 GiB"
time=2025-04-07T19:19:16.582-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
time=2025-04-07T19:19:16.582-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.09"
time=2025-04-07T19:19:16.833-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.25"
time=2025-04-07T19:19:17.084-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.43"
time=2025-04-07T19:19:17.335-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.62"
time=2025-04-07T19:19:17.587-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.81"
time=2025-04-07T19:19:17.838-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.99"
time=2025-04-07T19:19:17.866-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CPU buffer_type=CPU
time=2025-04-07T19:19:17.867-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=token_embd.weight type=q6_K shape="[5120 131072]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output.weight type=q2_K shape="[5120 131072]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.patch_conv.weight type=f16 shape="[14 14 3 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.encoder_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.norm.weight type=f32 shape=[1024]
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_1.weight type=f16 shape="[1024 5120]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_2.weight type=f16 shape="[5120 5120]"
time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.patch_merger.merging_layer.weight type=f16 shape="[4096 1024]"
time=2025-04-07T19:19:18.088-04:00 level=INFO source=server.go:619 msg="llama runner started in 1.76 seconds"
time=2025-04-07T19:19:18.088-04:00 level=DEBUG source=sched.go:464 msg="finished setting up runner" model=model_blob_file
[GIN] 2025/04/07 - 19:19:18 | 200 |    2.1745499s |     i.p.add.ress | POST     "/api/generate"
time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:468 msg="context for request finished"
time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=model_blob_file duration=5m0s
time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=model_blob_file refCount=0
time=2025-04-07T19:19:22.300-04:00 level=DEBUG source=sched.go:577 msg="evaluating already loaded" model=model_blob_file
time=2025-04-07T19:19:22.330-04:00 level=DEBUG source=routes.go:1522 msg="chat request" images=0 prompt="[SYSTEM_PROMPT]You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. \"yesterday\" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.[/SYSTEM_PROMPT][INST]tell a joke[/INST]"
time=2025-04-07T19:19:22.365-04:00 level=DEBUG source=process_text.go:304 msg="adding bos token to prompt" id=1
time=2025-04-07T19:19:22.365-04:00 level=DEBUG source=cache.go:136 msg="loading cache slot" id=0 cache=0 prompt=362 used=0 remaining=362
[GIN] 2025/04/07 - 19:20:49 | 200 |         1m27s |     i.p.add.ress | POST     "/api/chat"
time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:409 msg="context for request finished"
time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=model_blob_file duration=5m0s
time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=model_blob_file refCount=0
<!-- gh-comment-id:2784824526 --> @lowlyocean commented on GitHub (Apr 7, 2025):

> Although, having said that, I see that you have 9.9G available on one of your cards. I think that ollama should be able to load at least one layer there, so maybe my guess above is incorrect. If you can supply server logs it may be easier to diagnose.

```
2025/04/07 19:19:06 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://i.p.add.ress:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:modelDir OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-04-07T19:19:06.150-04:00 level=INFO source=images.go:458 msg="total blobs: 28"
time=2025-04-07T19:19:06.152-04:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=routes.go:1298 msg="Listening on i.p.add.ress:11434 (version 0.6.5)"
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=sched.go:107 msg="starting llm scheduler"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-04-07T19:19:06.153-04:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=4 efficiency=0 threads=8
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-04-07T19:19:06.153-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-04-07T19:19:06.154-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="PATHs"
time=2025-04-07T19:19:06.155-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvml.dll"
time=2025-04-07T19:19:06.156-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-04-07T19:19:06.175-04:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\Windows\system32\nvml.dll
time=2025-04-07T19:19:06.177-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-04-07T19:19:06.177-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="PATHs"
initializing C:\Windows\system32\nvcuda.dll
dlsym: cuInit - address
dlsym: cuDriverGetVersion - address
dlsym: cuDeviceGetCount - address
dlsym: cuDeviceGet - address
dlsym: cuDeviceGetAttribute - address
dlsym: cuDeviceGetUuid - address
dlsym: cuDeviceGetName - address
dlsym: cuCtxCreate_v3 - address
dlsym: cuMemGetInfo_v2 - address
dlsym: cuCtxDestroy - address
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 2
time=2025-04-07T19:19:06.198-04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=2 library=C:\Windows\system32\nvcuda.dll
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA totalMem 12287 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA freeMem 11247 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] Compute Capability 8.6
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA totalMem 3071 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA freeMem 2462 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] Compute Capability 6.1
time=2025-04-07T19:19:06.376-04:00 level=INFO source=gpu.go:319 msg="detected OS VRAM overhead" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB"
time=2025-04-07T19:19:06.379-04:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
releasing cuda driver library
releasing nvml library
time=2025-04-07T19:19:06.380-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="11.0 GiB"
time=2025-04-07T19:19:06.380-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB"
[GIN] 2025/04/07 - 19:19:15 | 200 | 0s | i.p.add.ress | HEAD "/"
[GIN] 2025/04/07 - 19:19:15 | 200 | 51.6352ms | i.p.add.ress | POST "/api/show"
time=2025-04-07T19:19:15.940-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="48.0 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:15.951-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="11.0 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:15.967-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:15.968-04:00 level=DEBUG source=sched.go:183 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=6 gpu_count=2
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=sched.go:226 msg="loading first model" model=model_blob_file
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T19:19:16.013-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.027-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.042-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.044-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T19:19:16.045-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.058-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.074-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T19:19:16.075-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.089-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.104-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.105-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T19:19:16.106-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.119-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.135-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.135-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.136-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.150-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.165-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.166-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.166-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.167-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.181-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.197-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.212-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.228-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.229-04:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="47.9 GiB" free_swap="42.4 GiB"
time=2025-04-07T19:19:16.229-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T19:19:16.229-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="47.9 GiB" before.free_swap="42.4 GiB" now.total="63.9 GiB" now.free="47.9 GiB" now.free_swap="42.4 GiB"
time=2025-04-07T19:19:16.243-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T19:19:16.259-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T19:19:16.260-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=32 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T19:19:16.260-04:00 level=WARN source=server.go:173 msg="flash attention enabled but not supported by gpu"
time=2025-04-07T19:19:16.260-04:00 level=WARN source=server.go:196 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]
time=2025-04-07T19:19:16.314-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T19:19:16.316-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T19:19:16.324-04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-04-07T19:19:16.324-04:00 level=INFO source=server.go:405 msg="starting llama server" cmd="%LocalAppData%\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model modelDir\\model_blob_file --ctx-size 4096 --batch-size 512 --n-gpu-layers 32 --verbose --threads 4 --no-mmap --parallel 1 --port 54638"
time=2025-04-07T19:19:16.324-04:00 level=DEBUG source=server.go:423 msg=subprocess environment="env variables"
time=2025-04-07T19:19:16.329-04:00 level=INFO source=sched.go:451 msg="loaded runners" count=1
time=2025-04-07T19:19:16.329-04:00 level=INFO source=server.go:580 msg="waiting for llama runner to start responding"
time=2025-04-07T19:19:16.329-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server error"
time=2025-04-07T19:19:16.354-04:00 level=INFO source=runner.go:816 msg="starting ollama engine"
time=2025-04-07T19:19:16.355-04:00 level=INFO source=runner.go:879 msg="Server listening on 127.0.0.1:54638"
time=2025-04-07T19:19:16.408-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.name default=""
time=2025-04-07T19:19:16.408-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.description default=""
time=2025-04-07T19:19:16.408-04:00 level=INFO source=ggml.go:67 msg="" architecture=mistral3 file_type=Q2_K name="" description="" num_tensors=585 num_key_values=43
time=2025-04-07T19:19:16.408-04:00 level=DEBUG source=ggml.go:93 msg="skipping path which is not part of ollama" path=C:\Python311\Scripts
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 55
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 20
ggml_backend_load_best: userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from userDir\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-04-07T19:19:16.437-04:00 level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(clang)
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_1.weight shape="[1024 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_2.weight shape="[5120 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.patch_merger.merging_layer.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output.weight shape="[5120 131072]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=token_embd.weight shape="[5120 131072]" dtype=14 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.437-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.17.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.438-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.19.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.21.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.3.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU 
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.6.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.9.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.encoder_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.patch_conv.weight shape="[14 14 3 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.439-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_down.weight shape="[32768 5120]" dtype=11 
buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.3.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_norm.weight shape=[5120] dtype=0 
buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.7.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_k.weight shape="[5120 1024]" dtype=10 
buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.11.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU 
time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.440-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.14.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU 
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.17.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU 
time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.19.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_down.weight shape="[32768 
5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.441-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=blk.24.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU 
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.26.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_down.weight shape="[32768 
5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=blk.31.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU 
time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.442-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.33.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_down.weight shape="[32768 
5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=blk.38.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU 
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T19:19:16.443-04:00 level=INFO source=ggml.go:289 msg="model weights" buffer=CPU size="9.4 GiB"
time=2025-04-07T19:19:16.582-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
time=2025-04-07T19:19:16.582-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.09"
time=2025-04-07T19:19:16.833-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.25"
time=2025-04-07T19:19:17.084-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.43"
time=2025-04-07T19:19:17.335-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.62"
time=2025-04-07T19:19:17.587-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.81"
time=2025-04-07T19:19:17.838-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.99"
time=2025-04-07T19:19:17.866-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CPU buffer_type=CPU
time=2025-04-07T19:19:17.867-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T19:19:17.869-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=token_embd.weight type=q6_K shape="[5120 131072]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T19:19:17.869-04:00 
level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG 
source=model.go:205 msg="found tensor" name=blk.2.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.869-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 
msg="found tensor" name=blk.3.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=blk.5.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=blk.7.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_norm.weight 
type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.870-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_v.weight type=f16 shape="[5120 1024]" 
time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.871-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_gate.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_output.weight type=q3_K shape="[4096 5120]" 
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.872-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.873-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_v.weight type=f16 shape="[5120 1024]" 
time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.874-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_gate.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_output.weight type=q3_K shape="[4096 5120]" 
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.875-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output_norm.weight type=f32 shape=[5120] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output.weight type=q2_K shape="[5120 131072]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.patch_conv.weight type=f16 shape="[14 14 3 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.encoder_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_up.weight type=f16 shape="[1024 4096]" 
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_v.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.876-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_down.weight type=f16 shape="[4096 1024]" 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_output.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_norm.weight type=f32 shape=[1024] 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_norm.weight type=f32 shape=[1024] 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_q.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_gate.weight type=f16 shape="[1024 
4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_k.weight type=f16 shape="[1024 
1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.877-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_up.weight type=f16 
shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_v.weight type=f16 
shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_down.weight 
type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=v.blk.19.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.878-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=v.blk.21.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 
msg="found tensor" name=v.blk.22.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.norm.weight type=f32 shape=[1024] time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=mm.linear_1.weight type=f16 shape="[1024 5120]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_2.weight type=f16 shape="[5120 5120]" time=2025-04-07T19:19:17.879-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.patch_merger.merging_layer.weight type=f16 shape="[4096 1024]" time=2025-04-07T19:19:18.088-04:00 level=INFO source=server.go:619 msg="llama runner started in 1.76 seconds" time=2025-04-07T19:19:18.088-04:00 level=DEBUG source=sched.go:464 msg="finished setting up runner" model=model_blob_file [GIN] 2025/04/07 - 19:19:18 | 200 | 2.1745499s | i.p.add.ress | POST "/api/generate" time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:468 msg="context for request finished" time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=model_blob_file duration=5m0s time=2025-04-07T19:19:18.089-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=model_blob_file refCount=0 time=2025-04-07T19:19:22.300-04:00 level=DEBUG source=sched.go:577 msg="evaluating already loaded" model=model_blob_file time=2025-04-07T19:19:22.330-04:00 level=DEBUG source=routes.go:1522 msg="chat request" images=0 prompt="[SYSTEM_PROMPT]You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. 
\"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. \"yesterday\" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.[/SYSTEM_PROMPT][INST]tell a joke[/INST]" time=2025-04-07T19:19:22.365-04:00 level=DEBUG source=process_text.go:304 msg="adding bos token to prompt" id=1 time=2025-04-07T19:19:22.365-04:00 level=DEBUG source=cache.go:136 msg="loading cache slot" id=0 cache=0 prompt=362 used=0 remaining=362 [GIN] 2025/04/07 - 19:20:49 | 200 | 1m27s | i.p.add.ress | POST "/api/chat" time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:409 msg="context for request finished" time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=model_blob_file duration=5m0s time=2025-04-07T19:20:49.949-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=model_blob_file refCount=0 ```

@rick-github commented on GitHub (Apr 8, 2025):

@maxi1134

> Odd, it still shows some CPU usage in ollama ps

The output from `ollama ps` is calculated before the value of `num_gpu` is taken into account, so it is incorrect.
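
For readers who want to reproduce this, here is a minimal sketch of how `num_gpu` can be passed per request through the `options` field of `/api/generate`. The endpoint URL (default local port) and the model tag (taken from the issue description) are assumptions; adjust both for your setup. Nothing is sent until `send()` is called.

```python
import json
import urllib.request

# Assumptions: default local Ollama endpoint, model tag from the issue description.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "mistral-small3.1:24b-instruct-2503-q2_k"


def build_request(prompt: str, num_gpu: int) -> urllib.request.Request:
    """Build a non-streaming generate request that asks for num_gpu offloaded layers."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": num_gpu},
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request) -> str:
    """Send the request to a running server and return the generated text."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `send(build_request("tell a joke", num_gpu=32))` would issue the request. Since `ollama ps` may be misleading here, the `layers.requested` and `layers.offload` fields in the server log are the more reliable place to check what was actually offloaded.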


@rick-github commented on GitHub (Apr 8, 2025):

@lowlyocean

time=2025-04-07T19:19:16.260-04:00 level=INFO source=server.go:138 msg=offload library=cuda
 layers.requested=32 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]"
 memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B"
 memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB"
 memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB"
 memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T19:19:16.260-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]

Yep, it decided that it couldn't fit a single layer anywhere. It might have something to do with the huge projector graph (`projector.graph="8.8 GiB"`), which basically edges everything out. You should be able to get around that by adding `%LocalAppData%\Programs\Ollama\lib\cuda_v12` to PATH in the server environment.
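
To make the "edges everything out" point concrete, here is a rough back-of-the-envelope sketch using the numbers from the offload line above and the RTX 3060 debug lines in the issue description. The `fits_one_layer` check is a simplified assumption for illustration, not Ollama's exact logic: it only asks whether the fixed overhead, the compute graph, the minimum reserve and a single layer fit in the available VRAM.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Values copied from the logs in this thread (RTX 3060).
available = 9.7 * GIB
projector_weights = 769.3 * MIB
projector_graph = 8.8 * GIB
minimum_memory = 479199232      # bytes, as logged (457 MiB)
layer_size = 232.6 * MIB
graph = 853.3 * MIB             # partial_offload / full_offload


def fits_one_layer(available, fixed_overhead, graph, minimum_memory, layer_size):
    """Hypothetical simplified check: can the GPU hold the overhead plus one layer?"""
    return available >= fixed_overhead + graph + minimum_memory + layer_size


# Projector weights plus projector graph land close to the logged gpu_zer_overhead.
fixed_overhead = projector_weights + projector_graph
print(f"projector overhead: {fixed_overhead / GIB:.2f} GiB")

# With the 8.8 GiB projector graph counted, not even one layer fits.
print(fits_one_layer(available, fixed_overhead, graph, minimum_memory, layer_size))

# Counting only the projector weights, the same GPU would have room.
print(fits_one_layer(available, projector_weights, graph, minimum_memory, layer_size))
```

Under these assumptions the projector estimate alone consumes nearly all of the 9.7 GiB reported as available, which is consistent with `layers.offload=0` even though the quantized weights total only 8.0 GiB.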


@wbste commented on GitHub (Apr 8, 2025):

Same issue on my 3090: I can't get the full model to load into VRAM. Latest Ollama, on Windows, not using flash attention. Other 27B and 32B parameter models work fine 100% offloaded. The log below says it needs 24.5 GiB to do so, which seems high for a Q4_K_M quant?

time=2025-04-07T17:27:22.111-07:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="52.0 GiB" free_swap="53.2 GiB"
time=2025-04-07T17:27:22.111-07:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[21.8 GiB]"
time=2025-04-07T17:27:22.112-07:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=33 layers.split="" memory.available="[21.8 GiB]" memory.gpu_overhead="0 B" memory.required.full="24.5 GiB" memory.required.partial="21.6 GiB" memory.required.kv="640.0 MiB" memory.required.allocations="[21.6 GiB]" memory.weights.total="13.1 GiB" memory.weights.repeating="12.7 GiB" memory.weights.nonrepeating="360.0 MiB" memory.graph.full="426.7 MiB" memory.graph.partial="426.7 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"

Here's gemma3:27b for reference, 100% GPU offloading, same quant:

time=2025-04-07T17:35:35.913-07:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="51.9 GiB" free_swap="52.9 GiB"
time=2025-04-07T17:35:35.914-07:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[21.7 GiB]"
time=2025-04-07T17:35:35.915-07:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=63 layers.offload=63 layers.split="" memory.available="[21.7 GiB]" memory.gpu_overhead="0 B" memory.required.full="19.7 GiB" memory.required.partial="19.7 GiB" memory.required.kv="1.2 GiB" memory.required.allocations="[19.7 GiB]" memory.weights.total="15.4 GiB" memory.weights.repeating="14.3 GiB" memory.weights.nonrepeating="1.1 GiB" memory.graph.full="565.0 MiB" memory.graph.partial="1.6 GiB" projector.weights="795.9 MiB" projector.graph="1.0 GiB"
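
Comparing two long `msg=offload` lines by eye is error-prone, so here is a small sketch that splits such a line into fields and converts the size strings to bytes. The helper names `parse_fields` and `to_bytes` are hypothetical, and the two sample strings are shortened copies of the lines quoted above. The regex assumes values contain no escaped quotes, which holds for these lines but not for every line the server logs.

```python
import re

# key=value pairs where the value is either "quoted with spaces" or a bare token.
_PAIR = re.compile(r'([\w.]+)=(?:"([^"]*)"|(\S*))')
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}


def parse_fields(line: str) -> dict:
    """Split a key=value log line into a dict; quoted values keep their spaces."""
    return {
        m.group(1): m.group(2) if m.group(2) is not None else m.group(3)
        for m in _PAIR.finditer(line)
    }


def to_bytes(size: str) -> float:
    """Convert strings such as '8.8 GiB' or '769.3 MiB' to bytes."""
    number, unit = size.split()
    return float(number) * _UNITS[unit]


# Shortened copies of the two offload lines from this comment.
mistral = ('msg=offload layers.model=41 layers.offload=33 '
           'memory.available="[21.8 GiB]" memory.required.full="24.5 GiB" '
           'projector.weights="769.3 MiB" projector.graph="8.8 GiB"')
gemma = ('msg=offload layers.model=63 layers.offload=63 '
         'memory.available="[21.7 GiB]" memory.required.full="19.7 GiB" '
         'projector.weights="795.9 MiB" projector.graph="1.0 GiB"')

for name, line in (("mistral-small3.1", mistral), ("gemma3:27b", gemma)):
    f = parse_fields(line)
    need = to_bytes(f["memory.required.full"])
    have = to_bytes(f["memory.available"].strip("[]"))  # single-GPU case only
    print(name, f["layers.offload"], "of", f["layers.model"], "layers,",
          "projector.graph =", f["projector.graph"], "fits fully:", need <= have)
```

The field that stands out between the two runs is `projector.graph`: 8.8 GiB for Mistral Small 3.1 versus 1.0 GiB for gemma3:27b, while the projector weights are nearly the same size.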

@rick-github commented on GitHub (Apr 8, 2025):

@lowlyocean Can you add logs from the most recent run?


@lowlyocean commented on GitHub (Apr 8, 2025):

> @lowlyocean Can you add logs from the most recent run?

2025/04/07 21:17:09 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://i.p.add.ress:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:pathto_model_root\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-04-07T21:17:09.706-04:00 level=INFO source=images.go:458 msg="total blobs: 27"
time=2025-04-07T21:17:09.708-04:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=routes.go:1298 msg="Listening on i.p.add.ress:11434 (version 0.6.5)"
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=sched.go:107 msg="starting llm scheduler"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=4 efficiency=0 threads=8
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[paths]"
time=2025-04-07T21:17:09.710-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvml.dll"
time=2025-04-07T21:17:09.713-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-04-07T21:17:09.732-04:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\Windows\system32\nvml.dll
time=2025-04-07T21:17:09.733-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-04-07T21:17:09.733-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[paths]"
time=2025-04-07T21:17:09.734-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvcuda.dll"
time=2025-04-07T21:17:09.736-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll]"
initializing C:\Windows\system32\nvcuda.dll
dlsym: cuInit - address
dlsym: cuDriverGetVersion - address
dlsym: cuDeviceGetCount - address
dlsym: cuDeviceGet - address
dlsym: cuDeviceGetAttribute - address
dlsym: cuDeviceGetUuid - address
dlsym: cuDeviceGetName - address
dlsym: cuCtxCreate_v3 - address
dlsym: cuMemGetInfo_v2 - address
dlsym: cuCtxDestroy - address
calling cuInit
calling cuDriverGetVersion
raw version 0x2f30
CUDA driver version: 12.8
calling cuDeviceGetCount
device count 2
time=2025-04-07T21:17:09.752-04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=2 library=C:\Windows\system32\nvcuda.dll
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA totalMem 12287 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA freeMem 11247 mb
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] Compute Capability 8.6
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA totalMem 3071 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA freeMem 2462 mb
[GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] Compute Capability 6.1
time=2025-04-07T21:17:09.918-04:00 level=INFO source=gpu.go:319 msg="detected OS VRAM overhead" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB"
time=2025-04-07T21:17:09.921-04:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
releasing cuda driver library
releasing nvml library
time=2025-04-07T21:17:09.922-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="11.0 GiB"
time=2025-04-07T21:17:09.922-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB"
[GIN] 2025/04/07 - 21:17:10 | 200 |            0s |     i.p.add.ress | HEAD     "/"
[GIN] 2025/04/07 - 21:17:10 | 200 |     50.2812ms |     i.p.add.ress | POST     "/api/show"
time=2025-04-07T21:17:10.166-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.3 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.181-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="11.0 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.197-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.198-04:00 level=DEBUG source=sched.go:183 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=6 gpu_count=2
time=2025-04-07T21:17:10.245-04:00 level=DEBUG source=sched.go:226 msg="loading first model" model=blob_file
time=2025-04-07T21:17:10.245-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T21:17:10.246-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.261-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.277-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T21:17:10.279-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.292-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.308-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]"
time=2025-04-07T21:17:10.310-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.323-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.339-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T21:17:10.341-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.354-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.370-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.385-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.401-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.403-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.416-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.432-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.447-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.463-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.463-04:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="50.2 GiB" free_swap="42.2 GiB"
time=2025-04-07T21:17:10.463-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.464-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.478-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.494-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.495-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T21:17:10.495-04:00 level=WARN source=server.go:173 msg="flash attention enabled but not supported by gpu"
time=2025-04-07T21:17:10.495-04:00 level=WARN source=server.go:196 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]
time=2025-04-07T21:17:10.551-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T21:17:10.568-04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-04-07T21:17:10.569-04:00 level=INFO source=server.go:405 msg="starting llama server" cmd="C:\\Users\\user\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model pathto_model_root\\blobfile --ctx-size 4096 --batch-size 512 --verbose --threads 4 --no-mmap --parallel 1 --port 61075"
time=2025-04-07T21:17:10.569-04:00 level=DEBUG source=server.go:423 msg=subprocess environment="[env vars]"
time=2025-04-07T21:17:10.572-04:00 level=INFO source=sched.go:451 msg="loaded runners" count=1
time=2025-04-07T21:17:10.572-04:00 level=INFO source=server.go:580 msg="waiting for llama runner to start responding"
time=2025-04-07T21:17:10.573-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server error"
time=2025-04-07T21:17:10.598-04:00 level=INFO source=runner.go:816 msg="starting ollama engine"
time=2025-04-07T21:17:10.600-04:00 level=INFO source=runner.go:879 msg="Server listening on 127.0.0.1:61075"
time=2025-04-07T21:17:10.656-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.name default=""
time=2025-04-07T21:17:10.656-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.description default=""
time=2025-04-07T21:17:10.656-04:00 level=INFO source=ggml.go:67 msg="" architecture=mistral3 file_type=Q2_K name="" description="" num_tensors=585 num_key_values=43
time=2025-04-07T21:17:10.656-04:00 level=DEBUG source=ggml.go:99 msg="ggml backend load all from path" path=C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
  Device 1: NVIDIA GeForce GTX 1060 3GB, compute capability 6.1, VMM: yes
time=2025-04-07T21:17:10.824-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
load_backend: loaded CUDA backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12\ggml-cuda.dll
time=2025-04-07T21:17:10.840-04:00 level=DEBUG source=ggml.go:99 msg="ggml backend load all from path" path=C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 55
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 20
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-04-07T21:17:10.848-04:00 level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 CUDA.1.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.1.USE_GRAPHS=1 CUDA.1.PEER_MAX_BATCH_SIZE=128 compiler=cgo(clang)
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_1.weight shape="[1024 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_2.weight shape="[5120 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.patch_merger.merging_layer.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output.weight shape="[5120 131072]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=token_embd.weight shape="[5120 131072]" dtype=14 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.encoder_norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.patch_conv.weight shape="[14 14 3 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.911-04:00 level=INFO source=ggml.go:289 msg="model weights" buffer=CPU size="9.4 GiB"
time=2025-04-07T21:17:11.075-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.11"
time=2025-04-07T21:17:11.326-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.27"
time=2025-04-07T21:17:11.576-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.45"
time=2025-04-07T21:17:11.828-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.61"
time=2025-04-07T21:17:12.079-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.77"
time=2025-04-07T21:17:12.330-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.94"
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CUDA0 buffer_type=CUDA0
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CUDA1 buffer_type=CUDA1
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CPU buffer_type=CUDA_Host
time=2025-04-07T21:17:12.425-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=token_embd.weight type=q6_K shape="[5120 131072]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output.weight type=q2_K shape="[5120 131072]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.patch_conv.weight type=f16 shape="[14 14 3 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.encoder_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_1.weight type=f16 shape="[1024 5120]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_2.weight type=f16 shape="[5120 5120]"
time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.patch_merger.merging_layer.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.580-04:00 level=INFO source=server.go:619 msg="llama runner started in 2.01 seconds"
time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:464 msg="finished setting up runner" model=blob_file
[GIN] 2025/04/07 - 21:17:12 | 200 |    2.4387179s |     i.p.add.ress | POST     "/api/generate"
time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:468 msg="context for request finished"
time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=blob_file duration=5m0s
time=2025-04-07T21:17:12.581-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=blob_file refCount=0
time=2025-04-07T21:17:17.815-04:00 level=DEBUG source=sched.go:577 msg="evaluating already loaded" model=blob_file
time=2025-04-07T21:17:17.854-04:00 level=DEBUG source=routes.go:1522 msg="chat request" images=0 prompt="[SYSTEM_PROMPT]You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. \"yesterday\" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.[/SYSTEM_PROMPT][INST]tell a joke[/INST]"
time=2025-04-07T21:17:17.889-04:00 level=DEBUG source=process_text.go:304 msg="adding bos token to prompt" id=1
time=2025-04-07T21:17:17.891-04:00 level=DEBUG source=cache.go:136 msg="loading cache slot" id=0 cache=0 prompt=362 used=0 remaining=362
[GIN] 2025/04/07 - 21:17:25 | 200 |    7.7953074s |     i.p.add.ress | POST     "/api/chat"
time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:409 msg="context for request finished"
time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=blob_file duration=5m0s
time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=blob_file refCount=0
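
For readers following the "gpu has too little memory to allocate any layers" entries in these logs, the reported fields can be added up by hand. The sketch below parses one such entry (field values copied from the RTX 3060 lines in this thread) and compares a simple sum against `available`. The helper names and the summation itself are assumptions made for illustration: the exact check in Ollama's `memory.go` is not shown in this thread and may weight the terms differently.

```python
import re

# Field values copied from the RTX 3060 log entry in this thread.
SAMPLE = (
    'level=DEBUG source=memory.go:194 '
    'msg="gpu has too little memory to allocate any layers" '
    'name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" '
    'minimum_memory=479199232 layer_size="232.6 MiB" '
    'gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" '
    'full_offload="853.3 MiB"'
)

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def to_bytes(text):
    """Convert a size string such as '9.7 GiB' to bytes."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def parse_fields(line):
    """Split a key=value log line into a dict (quoted values are unquoted)."""
    fields = {}
    for key, quoted, bare in re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', line):
        fields[key] = quoted or bare
    return fields


def required_bytes(fields):
    """Assumed requirement for placing one layer: overhead + graph + minimum + layer.

    This summation is a hypothetical reading of the logged fields,
    not the formula taken from Ollama's source.
    """
    graph = max(to_bytes(fields["partial_offload"]),
                to_bytes(fields["full_offload"]))
    return (to_bytes(fields["gpu_zer_overhead"])
            + graph
            + int(fields["minimum_memory"])
            + to_bytes(fields["layer_size"]))


fields = parse_fields(SAMPLE)
available = to_bytes(fields["available"])
required = required_bytes(fields)
print(f"available: {available / UNITS['GiB']:.2f} GiB")
print(f"required (assumed sum): {required / UNITS['GiB']:.2f} GiB")
print("fits:", required <= available)
```

Under this reading, the `gpu_zer_overhead` value of 9.5 GiB plus the 457 MiB `minimum_memory` already exceeds the 9.7 GiB reported as available, before any layer or graph memory is counted. That would explain why no layers are offloaded even though the quantized weights are about 10 GB in total.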
<!-- gh-comment-id:2784996300 --> @lowlyocean commented on GitHub (Apr 8, 2025): > [@lowlyocean](https://github.com/lowlyocean) Can you add logs from the most recent run? ``` 2025/04/07 21:17:09 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://i.p.add.ress:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:pathto_model_root\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]" time=2025-04-07T21:17:09.706-04:00 level=INFO source=images.go:458 msg="total blobs: 27" time=2025-04-07T21:17:09.708-04:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0" time=2025-04-07T21:17:09.709-04:00 level=INFO source=routes.go:1298 msg="Listening on i.p.add.ress:11434 (version 0.6.5)" time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=sched.go:107 msg="starting llm scheduler" time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs" time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu_windows.go:167 msg=packages count=1 time=2025-04-07T21:17:09.709-04:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=4 efficiency=0 threads=8 time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for 
NVIDIA" time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll time=2025-04-07T21:17:09.709-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[paths]" time=2025-04-07T21:17:09.710-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvml.dll" time=2025-04-07T21:17:09.713-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]" time=2025-04-07T21:17:09.732-04:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\Windows\system32\nvml.dll time=2025-04-07T21:17:09.733-04:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll time=2025-04-07T21:17:09.733-04:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[paths]" time=2025-04-07T21:17:09.734-04:00 level=DEBUG source=gpu.go:529 msg="skipping PhysX cuda library path" path="C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common\\nvcuda.dll" time=2025-04-07T21:17:09.736-04:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Windows\\system32\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll]" initializing C:\Windows\system32\nvcuda.dll dlsym: cuInit - address dlsym: cuDriverGetVersion - address dlsym: cuDeviceGetCount - address dlsym: cuDeviceGet - address dlsym: cuDeviceGetAttribute - address dlsym: cuDeviceGetUuid - address dlsym: cuDeviceGetName - address dlsym: cuCtxCreate_v3 - address dlsym: cuMemGetInfo_v2 - address dlsym: cuCtxDestroy - address calling cuInit calling cuDriverGetVersion raw version 0x2f30 CUDA driver version: 12.8 calling cuDeviceGetCount device count 2 time=2025-04-07T21:17:09.752-04:00 level=DEBUG source=gpu.go:125 msg="detected GPUs" count=2 library=C:\Windows\system32\nvcuda.dll [GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA totalMem 12287 mb 
[GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] CUDA freeMem 11247 mb [GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a] Compute Capability 8.6 [GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA totalMem 3071 mb [GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] CUDA freeMem 2462 mb [GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4] Compute Capability 6.1 time=2025-04-07T21:17:09.918-04:00 level=INFO source=gpu.go:319 msg="detected OS VRAM overhead" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" time=2025-04-07T21:17:09.921-04:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found." releasing cuda driver library releasing nvml library time=2025-04-07T21:17:09.922-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="11.0 GiB" time=2025-04-07T21:17:09.922-04:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" [GIN] 2025/04/07 - 21:17:10 | 200 | 0s | i.p.add.ress | HEAD "/" [GIN] 2025/04/07 - 21:17:10 | 200 | 50.2812ms | i.p.add.ress | POST "/api/show" time=2025-04-07T21:17:10.166-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.3 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB" time=2025-04-07T21:17:10.181-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="11.0 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 
GiB" time=2025-04-07T21:17:10.197-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T21:17:10.198-04:00 level=DEBUG source=sched.go:183 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=6 gpu_count=2 time=2025-04-07T21:17:10.245-04:00 level=DEBUG source=sched.go:226 msg="loading first model" model=blob_file time=2025-04-07T21:17:10.245-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]" time=2025-04-07T21:17:10.246-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB" time=2025-04-07T21:17:10.261-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T21:17:10.277-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB" 
time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T21:17:10.278-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]" time=2025-04-07T21:17:10.279-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB" time=2025-04-07T21:17:10.292-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB" time=2025-04-07T21:17:10.308-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB" releasing nvml library time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB" time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers" time=2025-04-07T21:17:10.309-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[9.7 GiB]" time=2025-04-07T21:17:10.310-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" 
now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.323-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.339-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.340-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=1 available="[2.4 GiB]"
time=2025-04-07T21:17:10.341-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.354-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.370-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.371-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.385-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.401-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="232.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="853.3 MiB" full_offload="853.3 MiB"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.402-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.403-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.416-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.432-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.433-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.447-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.463-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.463-04:00 level=INFO source=server.go:105 msg="system memory" total="63.9 GiB" free="50.2 GiB" free_swap="42.2 GiB"
time=2025-04-07T21:17:10.463-04:00 level=DEBUG source=memory.go:108 msg=evaluating library=cuda gpu_count=2 available="[9.7 GiB 2.4 GiB]"
time=2025-04-07T21:17:10.464-04:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="63.9 GiB" before.free="50.2 GiB" before.free_swap="42.2 GiB" now.total="63.9 GiB" now.free="50.2 GiB" now.free_swap="42.2 GiB"
time=2025-04-07T21:17:10.478-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a name="NVIDIA GeForce RTX 3060" overhead="0 B" before.total="12.0 GiB" before.free="9.7 GiB" now.total="12.0 GiB" now.free="9.7 GiB" now.used="2.3 GiB"
time=2025-04-07T21:17:10.494-04:00 level=DEBUG source=gpu.go:441 msg="updating cuda memory data" gpu=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 name="NVIDIA GeForce GTX 1060 3GB" overhead="489.1 MiB" before.total="3.0 GiB" before.free="2.4 GiB" now.total="3.0 GiB" now.free="2.4 GiB" now.used="120.7 MiB"
releasing nvml library
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-5b2d7b81-aea2-60ed-5c17-c4dcd8ecac9a library=cuda variant=v12 compute=8.6 driver=12.8 name="NVIDIA GeForce RTX 3060" total="12.0 GiB" available="9.7 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:194 msg="gpu has too little memory to allocate any layers" id=GPU-9ebc57b9-f5c2-a302-6b56-1dfb850658b4 library=cuda variant=v12 compute=6.1 driver=12.8 name="NVIDIA GeForce GTX 1060 3GB" total="3.0 GiB" available="2.4 GiB" minimum_memory=479199232 layer_size="208.6 MiB" gpu_zer_overhead="9.5 GiB" partial_offload="213.3 MiB" full_offload="213.3 MiB"
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=memory.go:338 msg="insufficient VRAM to load any model layers"
time=2025-04-07T21:17:10.495-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB" memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB" memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB" projector.weights="769.3 MiB" projector.graph="8.8 GiB"
time=2025-04-07T21:17:10.495-04:00 level=WARN source=server.go:173 msg="flash attention enabled but not supported by gpu"
time=2025-04-07T21:17:10.495-04:00 level=WARN source=server.go:196 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-04-07T21:17:10.495-04:00 level=DEBUG source=server.go:262 msg="compatible gpu libraries" compatible=[]
time=2025-04-07T21:17:10.551-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T21:17:10.559-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T21:17:10.568-04:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-04-07T21:17:10.569-04:00 level=INFO source=server.go:405 msg="starting llama server" cmd="C:\\Users\\user\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model pathto_model_root\\blobfile --ctx-size 4096 --batch-size 512 --verbose --threads 4 --no-mmap --parallel 1 --port 61075"
time=2025-04-07T21:17:10.569-04:00 level=DEBUG source=server.go:423 msg=subprocess environment="[env vars]"
time=2025-04-07T21:17:10.572-04:00 level=INFO source=sched.go:451 msg="loaded runners" count=1
time=2025-04-07T21:17:10.572-04:00 level=INFO source=server.go:580 msg="waiting for llama runner to start responding"
time=2025-04-07T21:17:10.573-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server error"
time=2025-04-07T21:17:10.598-04:00 level=INFO source=runner.go:816 msg="starting ollama engine"
time=2025-04-07T21:17:10.600-04:00 level=INFO source=runner.go:879 msg="Server listening on 127.0.0.1:61075"
time=2025-04-07T21:17:10.656-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.name default=""
time=2025-04-07T21:17:10.656-04:00 level=WARN source=ggml.go:152 msg="key not found" key=general.description default=""
time=2025-04-07T21:17:10.656-04:00 level=INFO source=ggml.go:67 msg="" architecture=mistral3 file_type=Q2_K name="" description="" num_tensors=585 num_key_values=43
time=2025-04-07T21:17:10.656-04:00 level=DEBUG source=ggml.go:99 msg="ggml backend load all from path" path=C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
Device 1: NVIDIA GeForce GTX 1060 3GB, compute capability 6.1, VMM: yes
time=2025-04-07T21:17:10.824-04:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
load_backend: loaded CUDA backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12\ggml-cuda.dll
time=2025-04-07T21:17:10.840-04:00 level=DEBUG source=ggml.go:99 msg="ggml backend load all from path" path=C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 55
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 20
ggml_backend_load_best: C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-04-07T21:17:10.848-04:00 level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 CUDA.1.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.1.USE_GRAPHS=1 CUDA.1.PEER_MAX_BATCH_SIZE=128 compiler=cgo(clang)
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_1.weight shape="[1024 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.linear_2.weight shape="[5120 5120]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.norm.weight shape=[1024] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=mm.patch_merger.merging_layer.weight shape="[4096 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output.weight shape="[5120 131072]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=output_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=token_embd.weight
shape="[5120 131072]" dtype=14 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.0.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG 
source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.1.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_gate.weight shape="[1024 4096]" 
dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.10.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.11.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=v.blk.12.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.12.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.attn_v.weight shape="[1024 1024]" dtype=1 
buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.13.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.14.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=v.blk.14.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.904-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.15.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_output.weight shape="[1024 1024]" dtype=1 
buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.16.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 
msg="created tensor" name=v.blk.17.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.17.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.18.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.19.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.2.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.2.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU 
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.20.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.21.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.22.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.22.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.23.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.3.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.4.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.4.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU 
time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.5.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.905-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=v.blk.6.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.6.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.7.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_down.weight shape="[4096 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.8.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_k.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_output.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_q.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.attn_v.weight shape="[1024 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_down.weight shape="[4096 
1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_gate.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.blk.9.ffn_up.weight shape="[1024 4096]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.encoder_norm.weight shape=[1024] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=v.patch_conv.weight shape="[14 14 3 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created 
tensor" name=blk.0.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.0.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.1.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG 
source=ggml.go:220 msg="created tensor" name=blk.2.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.2.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_down.weight shape="[32768 5120]" dtype=11 
buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.3.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.4.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.5.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.5.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 
level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.6.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_norm.weight shape=[5120] dtype=0 
buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.7.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.906-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.8.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.9.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.9.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU 
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.10.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.11.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" 
name=blk.12.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.12.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU 
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.907-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.13.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.14.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.15.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.16.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.17.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.18.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.19.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.20.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.21.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.22.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.908-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.23.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.24.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.25.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.26.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.27.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.28.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.29.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.30.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.909-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.31.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.32.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.33.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.34.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.35.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.36.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.37.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.38.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_k.weight shape="[5120 1024]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_output.weight shape="[4096 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_q.weight shape="[5120 4096]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.attn_v.weight shape="[5120 1024]" dtype=1 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_down.weight shape="[32768 5120]" dtype=11 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_gate.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_norm.weight shape=[5120] dtype=0 buffer_type=CPU
time=2025-04-07T21:17:10.910-04:00 level=DEBUG source=ggml.go:220 msg="created tensor" name=blk.39.ffn_up.weight shape="[5120 32768]" dtype=10 buffer_type=CPU
time=2025-04-07T21:17:10.911-04:00 level=INFO source=ggml.go:289 
msg="model weights" buffer=CPU size="9.4 GiB"
time=2025-04-07T21:17:11.075-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.11"
time=2025-04-07T21:17:11.326-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.27"
time=2025-04-07T21:17:11.576-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.45"
time=2025-04-07T21:17:11.828-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.61"
time=2025-04-07T21:17:12.079-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.77"
time=2025-04-07T21:17:12.330-04:00 level=DEBUG source=server.go:625 msg="model load progress 0.94"
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CUDA0 buffer_type=CUDA0
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CUDA1 buffer_type=CUDA1
time=2025-04-07T21:17:12.424-04:00 level=INFO source=ggml.go:388 msg="compute graph" backend=CPU buffer_type=CUDA_Host
time=2025-04-07T21:17:12.425-04:00 level=WARN source=ggml.go:152 msg="key not found" key=tokenizer.ggml.pretokenizer default="[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]*[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]+|[^\\r\\n\\p{L}\\p{N}]?[\\p{Lu}\\p{Lt}\\p{Lm}\\p{Lo}\\p{M}]+[\\p{Ll}\\p{Lm}\\p{Lo}\\p{M}]*|\\p{N}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n/]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.rope.freq_scale default=1
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.attention.layer_norm_epsilon default=9.999999747378752e-06
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.vision.longest_edge default=1540
time=2025-04-07T21:17:12.428-04:00 level=WARN source=ggml.go:152 msg="key not found" key=mistral3.text_config.rms_norm_eps default=9.999999747378752e-06
time=2025-04-07T21:17:12.428-04:00 level=DEBUG
source=model.go:205 msg="found tensor" name=token_embd.weight type=q6_K shape="[5120 131072]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.0.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=blk.1.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.1.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.2.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=blk.3.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.3.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_norm.weight 
type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.4.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.5.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_q.weight type=q2_K shape="[5120 4096]" 
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.6.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_up.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.7.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.8.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.428-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_k.weight type=q2_K shape="[5120 1024]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.9.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_down.weight type=q3_K shape="[32768 5120]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.10.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.11.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_v.weight type=f16 shape="[5120 1024]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.12.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.13.ffn_gate.weight type=q2_K shape="[5120 32768]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.14.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.attn_output.weight type=q3_K shape="[4096 5120]" 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.15.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.16.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_up.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_down.weight type=q3_K shape="[32768 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.17.ffn_gate.weight type=q2_K shape="[5120 32768]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_norm.weight type=f32 shape=[5120] time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_q.weight type=q2_K shape="[5120 4096]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_k.weight type=q2_K shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_v.weight type=f16 shape="[5120 1024]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.attn_output.weight type=q3_K shape="[4096 5120]" time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_norm.weight type=f32 shape=[5120] 
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.18.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.19.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.429-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.20.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.21.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.22.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.23.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.24.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.25.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.26.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.27.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.28.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.29.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.30.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.31.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.430-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.32.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.33.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.34.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.35.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.36.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.37.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.38.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_q.weight type=q2_K shape="[5120 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_k.weight type=q2_K shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_v.weight type=f16 shape="[5120 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.attn_output.weight type=q3_K shape="[4096 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_up.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_down.weight type=q3_K shape="[32768 5120]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=blk.39.ffn_gate.weight type=q2_K shape="[5120 32768]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output_norm.weight type=f32 shape=[5120]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=output.weight type=q2_K shape="[5120 131072]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.patch_conv.weight type=f16 shape="[14 14 3 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.encoder_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.0.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.431-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.1.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.2.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.3.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.4.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.5.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.6.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.7.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.432-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.8.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.9.ffn_down.weight type=f16 shape="[4096 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_q.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_k.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_v.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.attn_output.weight type=f16 shape="[1024 1024]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_norm.weight type=f32 shape=[1024]
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_gate.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_up.weight type=f16 shape="[1024 4096]"
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.10.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.11.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_v.weight type=f16 shape="[1024 1024]" 
time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.433-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.12.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.13.ffn_down.weight type=f16 shape="[4096 
1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.14.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.attn_output.weight type=f16 
shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.15.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.434-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.16.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_norm.weight 
type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.17.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=v.blk.18.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.18.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.19.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" 
name=v.blk.20.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.435-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.20.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found 
tensor" name=v.blk.21.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.21.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.22.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.436-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_q.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 
msg="found tensor" name=v.blk.23.attn_k.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_v.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.attn_output.weight type=f16 shape="[1024 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_gate.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_up.weight type=f16 shape="[1024 4096]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=v.blk.23.ffn_down.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.norm.weight type=f32 shape=[1024] time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_1.weight type=f16 shape="[1024 5120]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.linear_2.weight type=f16 shape="[5120 5120]" time=2025-04-07T21:17:12.437-04:00 level=DEBUG source=model.go:205 msg="found tensor" name=mm.patch_merger.merging_layer.weight type=f16 shape="[4096 1024]" time=2025-04-07T21:17:12.580-04:00 level=INFO source=server.go:619 msg="llama runner started in 2.01 seconds" time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:464 msg="finished setting up runner" model=blob_file [GIN] 2025/04/07 - 21:17:12 | 200 | 2.4387179s | i.p.add.ress | POST "/api/generate" time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:468 msg="context for request finished" time=2025-04-07T21:17:12.580-04:00 level=DEBUG source=sched.go:341 msg="runner with 
non-zero duration has gone idle, adding timer" modelPath=blob_file duration=5m0s time=2025-04-07T21:17:12.581-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=blob_file refCount=0 time=2025-04-07T21:17:17.815-04:00 level=DEBUG source=sched.go:577 msg="evaluating already loaded" model=blob_file time=2025-04-07T21:17:17.854-04:00 level=DEBUG source=routes.go:1522 msg="chat request" images=0 prompt="[SYSTEM_PROMPT]You are Mistral Small 3.1, a Large Language Model (LLM) created by Mistral AI, a French startup headquartered in Paris.\nYou power an AI assistant called Le Chat.\nYour knowledge base was last updated on 2023-10-01.\n\nWhen you're not sure about some information, you say that you don't have the information and don't make up anything.\nIf the user's question is not clear, ambiguous, or does not provide enough context for you to accurately answer the question, you do not try to answer it right away and you rather ask the user to clarify their request (e.g. \"What are some good restaurants around me?\" => \"Where are you?\" or \"When is the next flight to Tokyo\" => \"Where do you travel from?\").\nYou are always very attentive to dates, in particular you try to resolve dates (e.g. \"yesterday\" is {yesterday}) and when asked about information at specific dates, you discard information that is at another date.\nYou follow these instructions in all languages, and always respond to the user in the language they use or request.\nNext sections describe the capabilities that you have.\n\n# WEB BROWSING INSTRUCTIONS\n\nYou cannot perform any web search or access internet to open URLs, links etc. If it seems like the user is expecting you to do so, you clarify the situation and ask the user to copy paste the text directly in the chat.\n\n# MULTI-MODAL INSTRUCTIONS\n\nYou have the ability to read images, but you cannot generate images. 
You also cannot transcribe audio files or videos.\nYou cannot read nor transcribe audio files or videos.[/SYSTEM_PROMPT][INST]tell a joke[/INST]" time=2025-04-07T21:17:17.889-04:00 level=DEBUG source=process_text.go:304 msg="adding bos token to prompt" id=1 time=2025-04-07T21:17:17.891-04:00 level=DEBUG source=cache.go:136 msg="loading cache slot" id=0 cache=0 prompt=362 used=0 remaining=362 [GIN] 2025/04/07 - 21:17:25 | 200 | 7.7953074s | i.p.add.ress | POST "/api/chat" time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:409 msg="context for request finished" time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:341 msg="runner with non-zero duration has gone idle, adding timer" modelPath=blob_file duration=5m0s time=2025-04-07T21:17:25.583-04:00 level=DEBUG source=sched.go:359 msg="after processing request finished event" modelPath=blob_file refCount=0 ```

@rick-github commented on GitHub (Apr 8, 2025):

time=2025-04-07T21:17:10.495-04:00 level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1
 layers.model=41 layers.offload=0 layers.split="" memory.available="[9.7 GiB 2.4 GiB]" memory.gpu_overhead="0 B"
 memory.required.full="8.4 GiB" memory.required.partial="0 B" memory.required.kv="320.0 MiB"
 memory.required.allocations="[0 B 0 B]" memory.weights.total="8.0 GiB" memory.weights.repeating="7.8 GiB"
 memory.weights.nonrepeating="210.0 MiB" memory.graph.full="213.3 MiB" memory.graph.partial="213.3 MiB"
 projector.weights="769.3 MiB" projector.graph="8.8 GiB"

time=2025-04-07T21:17:10.569-04:00 level=INFO source=server.go:405 msg="starting llama server"
 cmd="C:\\Users\\user\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner
 --ollama-engine
 --model pathto_model_root\\blobfile
 --ctx-size 4096
 --batch-size 512
 --verbose --threads 4 --no-mmap --parallel 1 --port 61075"

load_backend: loaded CUDA backend from C:\Users\user\AppData\Local\Programs\Ollama\lib\ollama\cuda_v12\ggml-cuda.dll

It found the GPU backend, but num_gpu was unset, so it loaded the number of layers it calculated would fit, i.e. zero.
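A rough way to see where "zero" comes from is to redo the arithmetic on the logged sizes. This is only a sketch: it assumes the projector weights and projector graph are charged against a single GPU before any model layers are placed, which is an assumption about the estimator and not something stated in the log. The helper names are made up for illustration.

```python
import re

# Fields copied from the msg=offload log line above.
LOG = ('memory.available="[9.7 GiB 2.4 GiB]" memory.graph.full="213.3 MiB" '
       'projector.weights="769.3 MiB" projector.graph="8.8 GiB"')

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30}


def parse_sizes(text):
    """Return every '<number> <unit>' size found in text, in bytes."""
    pairs = re.findall(r"([\d.]+) (GiB|MiB|KiB|B)", text)
    return [float(number) * UNITS[unit] for number, unit in pairs]


def field(name, log=LOG):
    """Return the sizes stored in one key="..." field of the log line."""
    match = re.search(re.escape(name) + r'="([^"]*)"', log)
    return parse_sizes(match.group(1))


available = field("memory.available")          # one entry per GPU
graph = field("memory.graph.full")[0]
projector = field("projector.weights")[0] + field("projector.graph")[0]

# Assumption: the whole projector cost is subtracted from each GPU in turn.
headroom = [free - projector for free in available]

for index, left in enumerate(headroom):
    print(f"GPU {index}: {left / 2**20:.1f} MiB left after projector, "
          f"graph needs {graph / 2**20:.1f} MiB")
```

Under that assumption the larger GPU is left with about 150 MiB, less than the 213.3 MiB compute graph, and the smaller GPU goes negative, so no layer fits on either card. The 8.8 GiB projector.graph figure dominates the result.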


@lowlyocean commented on GitHub (Apr 8, 2025):

With the default context window of 2048, I was able to force num_gpu to 41 (all the layers) and it seems to load onto the GPU (despite ollama ps showing 100% CPU).

However, increasing the context window to 8192 causes a crash:

ggml_backend_cuda_buffer_type_alloc_buffer: allocating 248.01 MiB on device 1: cudaMalloc failed: out of memory
ggml_gallocr_reserve_n: failed to allocate CUDA1 buffer of size 260055040

Is someone already tackling the underlying issue that makes it necessary to set num_gpu and manipulate PATH manually?
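For anyone reproducing this, the override can also be sent per request. A minimal sketch, assuming a local Ollama server: `num_gpu` and `num_ctx` are passed under `options` in the body of a POST to `/api/generate`. The model tag is the one created in the original report, and `build_generate_request` is a hypothetical helper name. The snippet only builds the body and does not send it.

```python
import json


def build_generate_request(model, prompt, num_gpu, num_ctx):
    """Build a JSON body for POST /api/generate with per-request options."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_gpu: number of layers to offload; num_ctx: context window size.
        "options": {"num_gpu": num_gpu, "num_ctx": num_ctx},
    }


body = build_generate_request(
    "mistral-small3.1:24b-instruct-2503-q2_k",
    "tell a joke",
    num_gpu=41,    # all layers, as forced in the comment above
    num_ctx=2048,  # the context size that loaded without crashing
)
print(json.dumps(body, indent=2))
```

Raising `num_ctx` to 8192 with the same `num_gpu` is the combination reported to fail with `cudaMalloc failed: out of memory`, so forcing all layers bypasses the estimator but does not guarantee the allocation will succeed.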


@arturo-air commented on GitHub (Apr 8, 2025):

Something similar is happening to me. My ollama version is 0.6.5, and I just pulled the mistral model with ollama run mistral-small3.1 (hash b9aaf0c2586a).

I am using a 4090, and when I run the new mistral model, it does not allocate 100% on the GPU:

arturo@thinkpad:~$ nvidia-smi 
Tue Apr  8 05:45:51 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:01:00.0 Off |                  Off |
|  0%   40C    P0              69W / 450W |    112MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2020      G   /usr/lib/xorg/Xorg                           86MiB |
|    0   N/A  N/A      2361      G   /usr/bin/gnome-shell                         13MiB |
+---------------------------------------------------------------------------------------+
arturo@thinkpad:~$ ollama run mistral-small3.1:latest
>>> hello
Hello! How can I assist you today?

>>> 
arturo@thinkpad:~$ ollama ps
NAME                       ID              SIZE     PROCESSOR         UNTIL              
mistral-small3.1:latest    b9aaf0c2586a    26 GB    6%/94% CPU/GPU    4 minutes from now    
arturo@thinkpad:~$ nvidia-smi 
Tue Apr  8 05:46:43 2025       
.........
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2020      G   /usr/lib/xorg/Xorg                           86MiB |
|    0   N/A  N/A      2361      G   /usr/bin/gnome-shell                         13MiB |
|    0   N/A  N/A   1229081      C   /usr/local/bin/ollama                     13298MiB |
+---------------------------------------------------------------------------------------+

On the other hand, if I do the same operation with another model, like gemma3:27b, this doesn't happen:

arturo@thinkpad:~$ ollama run gemma3:27b
>>> hello
Hello there! 👋 

How can I help you today? Just let me know what you're thinking, or if you just wanted to say hi, that's great too! 
.....
>>> 
arturo@thinkpad:~$ ollama ps
NAME          ID              SIZE     PROCESSOR    UNTIL              
gemma3:27b    30ddded7fba6    22 GB    100% GPU     4 minutes from now    

So my guess is that ollama is estimating wrong the new mistral model size, since it also is smaller than the gemma one:

arturo@thinkpad:~$ ollama ls
NAME                          ID              SIZE      MODIFIED       
mistral-small3.1:latest       b9aaf0c2586a    15 GB     35 minutes ago    
gemma3:27b                    30ddded7fba6    17 GB     3 weeks ago
<!-- gh-comment-id:2785318109 --> @arturo-air commented on GitHub (Apr 8, 2025): Something similar is happening to me. My ollama version is 0.6.5, and I just pulled the mistral model `ollama run mistral-small3.1` (hash b9aaf0c2586a). I am using a 4090, and when I run the new mistral model, it does not allocate 100% on the GPU: ``` arturo@thinkpad:~$ nvidia-smi Tue Apr 8 05:45:51 2025 +---------------------------------------------------------------------------------------+ | NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 | |-----------------------------------------+----------------------+----------------------+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | | | | MIG M. | |=========================================+======================+======================| | 0 NVIDIA GeForce RTX 4090 Off | 00000000:01:00.0 Off | Off | | 0% 40C P0 69W / 450W | 112MiB / 24564MiB | 0% Default | | | | N/A | +-----------------------------------------+----------------------+----------------------+ +---------------------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=======================================================================================| | 0 N/A N/A 2020 G /usr/lib/xorg/Xorg 86MiB | | 0 N/A N/A 2361 G /usr/bin/gnome-shell 13MiB | +---------------------------------------------------------------------------------------+ arturo@thinkpad:~$ ollama run mistral-small3.1:latest >>> hello Hello! How can I assist you today? >>> arturo@thinkpad:~$ ollama ps NAME ID SIZE PROCESSOR UNTIL mistral-small3.1:latest b9aaf0c2586a 26 GB 6%/94% CPU/GPU 4 minutes from now arturo@thinkpad:~$ nvidia-smi Tue Apr 8 05:46:43 2025 ......... 
+---------------------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=======================================================================================| | 0 N/A N/A 2020 G /usr/lib/xorg/Xorg 86MiB | | 0 N/A N/A 2361 G /usr/bin/gnome-shell 13MiB | | 0 N/A N/A 1229081 C /usr/local/bin/ollama 13298MiB | +---------------------------------------------------------------------------------------+ ``` On the other hand, if I do the same operation with other model, like `gemma3:27b`, this error doesn't happen: ``` arturo@thinkpad:~$ ollama run gemma3:27b >>> hello Hello there! 👋 How can I help you today? Just let me know what you're thinking, or if you just wanted to say hi, that's great too! ..... >>> arturo@thinkpad:~$ ollama ps NAME ID SIZE PROCESSOR UNTIL gemma3:27b 30ddded7fba6 22 GB 100% GPU 4 minutes from now ``` So my guess is that ollama is estimating wrong the new mistral model size, since it also is smaller than the gemma one: ``` arturo@thinkpad:~$ ollama ls NAME ID SIZE MODIFIED mistral-small3.1:latest b9aaf0c2586a 15 GB 35 minutes ago gemma3:27b 30ddded7fba6 17 GB 3 weeks ago ```
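The gap between what `ollama ps` reports and what `nvidia-smi` shows in this comment can be put into numbers. The Python sketch below uses only figures from the transcripts above. Two things are assumptions made for illustration: that the PROCESSOR percentage is a share of the estimated SIZE, and that `ollama ps` prints decimal gigabytes.

```python
# Figures copied from the transcripts in this comment.
OLLAMA_PS_SIZE_GB = 26        # SIZE column of `ollama ps`
OLLAMA_PS_GPU_SHARE = 0.94    # PROCESSOR column: "6%/94% CPU/GPU"
NVIDIA_SMI_USED_MIB = 13298   # ollama process row in `nvidia-smi`
GPU_TOTAL_MIB = 24564         # RTX 4090 total reported by `nvidia-smi`

MIB = 1024 ** 2
GB = 1000 ** 3  # assumption: `ollama ps` reports decimal gigabytes


def mib_to_gb(mib: float) -> float:
    """Convert mebibytes (nvidia-smi units) to decimal gigabytes."""
    return mib * MIB / GB


# Assumption: the GPU percentage applies to the estimated total size.
estimated_gpu_gb = OLLAMA_PS_SIZE_GB * OLLAMA_PS_GPU_SHARE
actual_gpu_gb = mib_to_gb(NVIDIA_SMI_USED_MIB)
card_gb = mib_to_gb(GPU_TOTAL_MIB)

print(f"estimated on GPU: {estimated_gpu_gb:.2f} GB")
print(f"actually used:    {actual_gpu_gb:.2f} GB")
print(f"card capacity:    {card_gb:.2f} GB")
```

Under these assumptions the estimate places roughly 24 GB on the GPU, while the ollama process actually holds about 14 GB, which is the discrepancy this comment describes.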
Author
Owner

@metal3d commented on GitHub (Apr 8, 2025):

I've got the same problem on one server with Deepseek-R1: the model is not spread across the 4 cards with 8 GB of VRAM each (so 32 GB available), while it works on my computer (same distribution, Fedora 41) with a single card that has 24 GB of VRAM.

<!-- gh-comment-id:2786571426 --> @metal3d commented on GitHub (Apr 8, 2025): I've got the same problem on one server with Deepseek-R1: the model is not spread across the 4 cards with 8 GB of VRAM each (so 32 GB available), while it works on my computer (same distribution, Fedora 41) with a single card that has 24 GB of VRAM.
Author
Owner
<!-- gh-comment-id:2786610135 --> @vini-muchulski commented on GitHub (Apr 8, 2025): try this! https://www.reddit.com/r/LocalLLaMA/comments/1judvfg/how_to_fix_slow_inference_speed_of_mistralsmall/
Author
Owner

@rick-github commented on GitHub (Apr 8, 2025):

> Is there already someone tackling the underlying issue causing the need to have to set num_gpu and manipulate PATH manually?

The underlying issue is that you don't have enough VRAM to run the model. Setting num_gpu and PATH tries to skirt the issue, but then you end up with OOMs.

> So my guess is that ollama is estimating wrong the new mistral model size, since it also is smaller than the gemma one:

They are two different models with different architectures, context window, block count, etc. The memory estimation will not be the same.

> the model is not filled inside the 4 Cards with 8Go VRAM (so 32Go available)

When a model is shared across multiple devices, the amount of overhead goes up. That is, a model loaded into a GPU consists of weights, context buffer, computation graph, projector data structures, etc. Some of those allocations need to be replicated across all devices, so multiple copies of those allocations increases the overall VRAM requirement.

<!-- gh-comment-id:2786618920 --> @rick-github commented on GitHub (Apr 8, 2025): > Is there already someone tackling the underlying issue causing the need to have to set num_gpu and manipulate PATH manually? The underlying issue is that you don't have enough VRAM to run the model. Setting `num_gpu` and PATH tries to skirt the issue, but then you end up with OOMs. > So my guess is that ollama is estimating wrong the new mistral model size, since it also is smaller than the gemma one: They are two different models with different architectures, context window, block count, etc. The memory estimation will not be the same. > the model is not filled inside the 4 Cards with 8Go VRAM (so 32Go available) When a model is shared across multiple devices, the amount of overhead goes up. That is, a model loaded into a GPU consists of weights, context buffer, computation graph, projector data structures, etc. Some of those allocations need to be replicated across all devices, so multiple copies of those allocations increases the overall VRAM requirement.
Author
Owner

@rick-github commented on GitHub (Apr 8, 2025):

See here for ways to deal with OOMs.

<!-- gh-comment-id:2786643854 --> @rick-github commented on GitHub (Apr 8, 2025): See [here](https://github.com/ollama/ollama/issues/8597#issuecomment-2614533288) for ways to deal with OOMs.
Author
Owner

@lowlyocean commented on GitHub (Apr 8, 2025):

> > Is there already someone tackling the underlying issue causing the need to have to set num_gpu and manipulate PATH manually?
>
> The underlying issue is that you don't have enough VRAM to run the model. Setting num_gpu and PATH tries to skirt the issue, but then you end up with OOMs.

I meant specifically the issue that causes it to think that 0 layers can be fit on the GPU even though the GPU is bigger than the entire model

<!-- gh-comment-id:2786726270 --> @lowlyocean commented on GitHub (Apr 8, 2025): > > Is there already someone tackling the underlying issue causing the need to have to set num_gpu and manipulate PATH manually? > > The underlying issue is that you don't have enough VRAM to run the model. Setting `num_gpu` and PATH tries to skirt the issue, but then you end up with OOMs. I meant specifically the issue that causes it to think that 0 layers can be fit on the GPU even though the GPU is bigger than the entire model
Author
Owner

@rick-github commented on GitHub (Apr 8, 2025):

The memory estimation logic is receiving attention and there's always room for improvement, but fundamentally you don't have enough VRAM to host the model with the default parameters. There are ways to tweak those parameters, as I linked above, but as I mentioned about splitting the model across multiple devices, the extra overhead involved with your setup is just going to make it hard to host mistral-small.

<!-- gh-comment-id:2786758369 --> @rick-github commented on GitHub (Apr 8, 2025): The memory estimation logic is receiving attention and there's always room for improvement, but fundamentally you don't have enough VRAM to host the model with the default parameters. There are ways to tweak those parameters, as I linked above, but as I mentioned about splitting the model across multiple devices, the extra overhead involved with your setup is just going to make it hard to host mistral-small.
Author
Owner

@lowlyocean commented on GitHub (Apr 8, 2025):

> The memory estimation logic is receiving attention and there's always room for improvement, but fundamentally you don't have enough VRAM to host the model with the default parameters. There are ways to tweak those parameters, as I linked above, but as I mentioned about splitting the model across multiple devices, the extra overhead involved with your setup is just going to make it hard to host mistral-small.

Thanks. I want to point out that when I ran the Bartowski Q2_K quant of Mistral Small 3.1 from HF
ollama pull hf.co/bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF:Q2_K
it worked with a context of 8192, flash attention on, and a q8_0 KV cache, with ollama ps showing that it automatically fit entirely into the GPU without needing to add libraries to PATH or set num_gpu manually.

To be clear: there seems to be something specific about having quantized the official Ollama version that's causing this edge case.

<!-- gh-comment-id:2786769678 --> @lowlyocean commented on GitHub (Apr 8, 2025): > The memory estimation logic is receiving attention and there's always room for improvement, but fundamentally you don't have enough VRAM to host the model with the default parameters. There are ways to tweak those parameters as a linked above, but as I mentioned about splitting the model across multiple devices, the extra overhead involved with your setup is just going to make it hard to host mistral-small. Thanks, I want to point out that when I was running the Bartowski Q2_K quant of Mistral Small 3.1 from HF `ollama pull hf.co/bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF:Q2_K` it is able to work with context of 8192, flash attention on, and KV cache q8_0 with `ollama ps` showing that it automatically fit entirely into GPU without needing to add libraries to a PATH or setting num_gpu manually. To be clear: there seems to be something specific about having quantized the official Ollama version that's causing this edge case.
Author
Owner

@rick-github commented on GitHub (Apr 8, 2025):

The bartowski model doesn't do vision. The projector graph for vision support is large.

<!-- gh-comment-id:2786853865 --> @rick-github commented on GitHub (Apr 8, 2025): The bartowski model [doesn't do vision](https://huggingface.co/bartowski/mistralai_Mistral-Small-3.1-24B-Instruct-2503-GGUF/discussions/1). The projector graph for vision support is [large](https://github.com/ollama/ollama/issues/10167#issuecomment-2784917541).
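The debug log in the issue description contains enough numbers to sketch why zero layers are offloaded even though the card is larger than the model file. The Python sketch below is a simplified model and not Ollama's actual estimation code: it reserves the logged fixed costs first and then counts whole layers in whatever is left. The fitting rule and the helper name `layers_that_fit` are assumptions made for illustration; all figures are copied from the RTX 3060 log line.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

# Values copied from the RTX 3060 line of the debug log in the issue description.
available = 9.9 * GIB
layer_size = 232.6 * MIB
gpu_zero_overhead = 9.5 * GIB   # logged as gpu_zer_overhead
graph = 853.3 * MIB             # logged as partial_offload / full_offload
minimum_memory = 479199232      # bytes, logged as minimum_memory


def layers_that_fit(available, overhead, graph, minimum, layer_size):
    """Simplified model (an assumption, not Ollama's code): subtract the
    fixed costs, then count how many whole layers fit in the remainder."""
    remaining = available - overhead - graph - minimum
    return max(0, math.floor(remaining / layer_size))


with_overhead = layers_that_fit(
    available, gpu_zero_overhead, graph, minimum_memory, layer_size
)
without_overhead = layers_that_fit(
    available, 0, graph, minimum_memory, layer_size
)

print("layers with the logged overhead:   ", with_overhead)
print("layers if the overhead were absent:", without_overhead)
```

In this simplified model the 9.5 GiB overhead alone consumes almost all of the 9.9 GiB that is available, so no layer fits. That overhead is presumably the vision projector cost mentioned above, which the text-only Bartowski quant does not carry.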
Author
Owner

@lowlyocean commented on GitHub (Apr 8, 2025):

Thanks, understood. So then, for now if I set num_gpu manually to the max amount (41) I seem to get tokens/s nearly on par with the Bartowski model, and no OOM failures. Future visitors to this issue may want to consider doing that as a workaround

<!-- gh-comment-id:2786864423 --> @lowlyocean commented on GitHub (Apr 8, 2025): Thanks, understood. So then, for now if I set num_gpu manually to the max amount (41) I seem to get tokens/s nearly on par with the Bartowski model, and no OOM failures. Future visitors to this issue may want to consider doing that as a workaround
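For visitors who want to try the `num_gpu` workaround per request instead of editing a Modelfile, here is a minimal Python sketch that passes it through the `options` field of Ollama's `/api/generate` endpoint. The helper names `build_payload` and `generate` are hypothetical, the URL is the default local endpoint, and the model name is the quantized tag created in the issue description. As noted earlier in the thread, forcing layers onto the GPU can end in OOMs.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint


def build_payload(model, prompt, num_gpu, num_ctx=None):
    """Build a non-streaming generate request that overrides num_gpu
    (number of layers to offload) and optionally num_ctx (context size)."""
    options = {"num_gpu": num_gpu}
    if num_ctx is not None:
        options["num_ctx"] = num_ctx
    return {"model": model, "prompt": prompt, "stream": False, "options": options}


def generate(payload, url=OLLAMA_URL):
    """POST the payload to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


payload = build_payload(
    "mistral-small3.1:24b-instruct-2503-q2_k", "hello", num_gpu=41
)
print(json.dumps(payload, indent=2))
# generate(payload) would send the request; it needs a running Ollama server.
```

Keeping the payload construction separate from the network call makes it easy to inspect exactly which options are being overridden before anything is loaded.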
Author
Owner

@orrinwitt commented on GitHub (Apr 9, 2025):

I just wanted to chime in and say I've got the same issue. 3 nvidia gpu's available with a combined ~28gb vram (12+8+8), and nvidia-smi shows literally none of the model being loaded to any of them. eventually I get output, since i have enough system ram, but that's obviously unusable output speed.

<!-- gh-comment-id:2790146529 --> @orrinwitt commented on GitHub (Apr 9, 2025): I just wanted to chime in and say I've got the same issue. 3 nvidia gpu's available with a combined ~28gb vram (12+8+8), and nvidia-smi shows literally none of the model being loaded to any of them. eventually I get output, since i have enough system ram, but that's obviously unusable output speed.
Author
Owner

@metal3d commented on GitHub (Apr 9, 2025):

I had the same... and finally found that reducing the context window size to 8096 made everything OK on multiple GPUs.

On Wed, Apr 9, 2025 at 17:38, Orrin Witt wrote:

> I just wanted to chime in and say I've got the same issue. 3 nvidia gpu's available with a combined ~28gb vram (12+8+8), and nvidia-smi shows literally none of the model being loaded to any of them. eventually I get output, since i have enough system ram, but that's obviously unusable output speed.



<!-- gh-comment-id:2790513677 --> @metal3d commented on GitHub (Apr 9, 2025): I had the same... and finally found that reducing the context window size to 8096 made everything OK on multiple GPUs. On Wed, Apr 9, 2025 at 17:38, Orrin Witt wrote: > I just wanted to chime in and say I've got the same issue. 3 nvidia gpu's available with a combined ~28gb vram (12+8+8), and nvidia-smi shows literally none of the model being loaded to any of them. eventually I get output, since i have enough system ram, but that's obviously unusable output speed.
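Reducing the context window helps because the KV cache grows linearly with context length. The Python sketch below estimates that cache with the usual formula (keys plus values, per layer, per KV head). The architecture defaults (40 layers, 8 KV heads, head dimension 128) are assumptions for a Mistral Small class model and are not stated anywhere in this thread, and the function name `kv_cache_bytes` is hypothetical. Ollama's real estimate also includes graph and projector buffers that this ignores.

```python
def kv_cache_bytes(num_ctx, n_layers=40, n_kv_heads=8, head_dim=128,
                   bytes_per_element=2):
    """Rough KV cache size in bytes.

    The factor 2 covers keys and values. bytes_per_element=2 corresponds
    to an f16 cache. The architecture defaults are assumed, not taken
    from this thread.
    """
    return 2 * n_layers * n_kv_heads * head_dim * num_ctx * bytes_per_element


for ctx in (8096, 32768, 131072):
    gib = kv_cache_bytes(ctx) / 2 ** 30
    print(f"num_ctx={ctx:>6}: about {gib:.2f} GiB of KV cache")
```

Because the relationship is linear, halving the context halves this part of the VRAM requirement, which is consistent with a smaller context making the multi-GPU split fit.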
Author
Owner

@orrinwitt commented on GitHub (Apr 18, 2025):

This behavior remains unchanged in 0.6.6-rc2

<!-- gh-comment-id:2816118348 --> @orrinwitt commented on GitHub (Apr 18, 2025): This behavior remains unchanged in 0.6.6-rc2
Author
Owner

@thot-experiment commented on GitHub (Apr 25, 2025):

I'm also getting this issue: with a 32 GB GPU, I cannot load mistral 3.1 q6k fully into VRAM. If I let ollama see my second GPU, the model gets split, taking a total of 24 GB. Quantizing the same model via ollama down to q5ks lets me fit it on one GPU, but ollama ps reports 31 GB of VRAM used, while nvidia-smi and Task Manager both agree that only 22 GB of VRAM is used (including by other applications). Memory usage was tested during vision inference, and the model outputs confirm that the vision tower is working correctly.

This is a breaking bug, it prevents people from loading models that would fit into vram for no reason other than bad usage estimation. I'm using num_gpu 41 btw.

<!-- gh-comment-id:2831567663 --> @thot-experiment commented on GitHub (Apr 25, 2025): I'm also getting this issue: with a 32 GB GPU, I cannot load mistral 3.1 q6k fully into VRAM. If I let ollama see my second GPU, the model gets split, taking a total of 24 GB. Quantizing the same model via ollama down to q5ks lets me fit it on one GPU, but `ollama ps` reports 31 GB of VRAM used, while nvidia-smi and Task Manager both agree that only 22 GB of VRAM is used (including by other applications). Memory usage was tested during vision inference, and the model outputs confirm that the vision tower is working correctly. This is a breaking bug, it prevents people from loading models that would fit into vram for no reason other than bad usage estimation. I'm using num_gpu 41 btw.
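To compare Ollama's estimate with real usage, as this comment does, the per-GPU figures can be read from `nvidia-smi` in CSV form. The Python sketch below parses that output; the helper names `parse_gpu_memory` and `read_gpu_memory` are hypothetical, and the sample line only illustrates the format, reusing the idle RTX 4090 figures (112 MiB of 24564 MiB) from earlier in the thread.

```python
import subprocess

# Query per-GPU memory in MiB, without a header row or unit suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse lines such as '0, 112, 24564' into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        rows.append(
            {"index": int(index), "used_mib": int(used), "total_mib": int(total)}
        )
    return rows


def read_gpu_memory():
    """Run nvidia-smi and return the parsed rows (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


# Format illustration only; figures reuse the idle RTX 4090 readout above.
sample = "0, 112, 24564\n"
print(parse_gpu_memory(sample))
```

Running `read_gpu_memory()` before and after loading a model gives the actual VRAM delta, which can then be set against the SIZE and PROCESSOR columns of `ollama ps`.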
Author
Owner

@fbroussais commented on GitHub (May 11, 2025):

I have the same issue with an RTX 4060 Ti 16 GB: ~3 GB in VRAM and ~11 GB in CPU RAM with mistral-small3.1:latest (14 GB)
ollama 0.6.8 on Win11
Other models of the same size fit more into VRAM (90% for the 12 GB mistral-nemo:12b-instruct-2407-q8_0)

<!-- gh-comment-id:2869826828 --> @fbroussais commented on GitHub (May 11, 2025): I have the same issue with an RTX 4060 Ti 16 GB: ~3 GB in VRAM and ~11 GB in CPU RAM with mistral-small3.1:latest (14 GB) ollama 0.6.8 on Win11 Other models of the same size fit more into VRAM (90% for the 12 GB mistral-nemo:12b-instruct-2407-q8_0)
Reference: github-starred/ollama#32432