[GH-ISSUE #10930] Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file llama_load_model_from_file: failed to load model #69252

Open
opened 2026-05-04 17:35:42 -05:00 by GiteaMirror · 6 comments

Originally created by @alperen21 on GitHub (May 31, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10930

What is the issue?

I fine-tuned the Llama 3.2 3B model with Unsloth and exported it to GGUF, but it fails to load when I try to run it with Ollama.

This is the error:

`Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file

llama_load_model_from_file: failed to load model`
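
The error message suggests that the GGUF metadata lacks the BPE merges entry, which is stored under the key `tokenizer.ggml.merges`. A minimal sketch for confirming that with only the Python standard library follows. It assumes a little-endian GGUF file of version 2 or 3; `gguf_metadata_keys` and `has_merges` are hypothetical helper names used for illustration and are not part of Ollama or llama.cpp.

```python
import struct
from typing import BinaryIO, List

# Fixed-size GGUF metadata value types mapped to little-endian struct formats.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f: BinaryIO, fmt: str):
    """Read and unpack a single value described by a struct format."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _skip_value(f: BinaryIO, vtype: int) -> None:
    """Advance the file position past one metadata value."""
    if vtype in _SCALARS:
        f.seek(struct.calcsize(_SCALARS[vtype]), 1)
    elif vtype == _STRING:
        f.seek(_read(f, "<Q"), 1)  # uint64 length, then the bytes
    elif vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        if item_type in _SCALARS:
            f.seek(count * struct.calcsize(_SCALARS[item_type]), 1)
        else:
            for _ in range(count):
                _skip_value(f, item_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_metadata_keys(f: BinaryIO) -> List[str]:
    """Return the metadata key names from the header of a GGUF file."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError("this sketch only handles GGUF version 2 or later")
    _read(f, "<Q")  # tensor count, not needed here
    kv_count = _read(f, "<Q")
    keys = []
    for _ in range(kv_count):
        key_len = _read(f, "<Q")
        keys.append(f.read(key_len).decode("utf-8"))
        _skip_value(f, _read(f, "<I"))
    return keys


def has_merges(path: str) -> bool:
    """True if the GGUF file at `path` has a tokenizer.ggml.merges entry."""
    with open(path, "rb") as f:
        return "tokenizer.ggml.merges" in gguf_metadata_keys(f)
```

Calling `has_merges("unsloth.Q8_0.gguf")` on the file from this report would show whether the key is present. If it is missing, the problem was introduced during the GGUF conversion rather than by Ollama.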

This is the Modelfile:
`FROM ./unsloth.Q8_0.gguf

TEMPLATE """Below are some instructions that describe some tasks. Write responses that appropriately complete each request.{{ if .Prompt }}

Instruction:

{{ .Prompt }}{{ end }}

Response:

{{ .Response }}<|end_of_text|>"""

PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|end_of_text|>"
PARAMETER stop "<|reserved_special_token_"
PARAMETER temperature 1.5
PARAMETER min_p 0.1
`
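
The GGUF named in `FROM` was converted from a local Hugging Face export (the `model` directory in the log below), so it may help to check how that export stores its BPE merges. The sketch below is a hedged illustration: `describe_merges` is a hypothetical helper, and the idea that the storage format (plain strings versus two-element pairs in `tokenizer.json`) matters to the converter is an assumption that depends on the llama.cpp version in use. This report does not confirm it.

```python
import json
from pathlib import Path


def describe_merges(model_dir: str) -> str:
    """Report where, if anywhere, a Hugging Face export stores BPE merges."""
    root = Path(model_dir)
    if (root / "merges.txt").is_file():
        return "merges.txt present"

    tokenizer_file = root / "tokenizer.json"
    if not tokenizer_file.is_file():
        return "no merges.txt and no tokenizer.json"

    with tokenizer_file.open(encoding="utf-8") as f:
        tokenizer = json.load(f)

    # In tokenizer.json the BPE merges live under model.merges.
    merges = (tokenizer.get("model") or {}).get("merges")
    if not merges:
        return "tokenizer.json has no merges"

    first = merges[0]
    if isinstance(first, str):
        return f"{len(merges)} merges stored as strings"
    if isinstance(first, list):
        return f"{len(merges)} merges stored as pairs"
    return "unrecognised merges format"
```

Running `describe_merges("model")` against the exported directory gives a quick indication of whether the merges were written at all, and in which form, before the GGUF conversion ran.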

Here is the output of the training script:
`Requirement already satisfied: unsloth in ./.venv/lib/python3.11/site-packages (2025.5.10)
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))== Unsloth 2025.5.10: Fast Llama patching. Transformers: 4.52.4.
\ /| NVIDIA A100 80GB PCIe. Num GPUs = 1. Max memory: 79.254 GB. Platform: Linux.
O^O/ _/ \ Torch: 2.7.0+cu126. CUDA: 8.0. CUDA Toolkit: 12.6. Triton: 3.3.0
\ / Bfloat16 = TRUE. FA [Xformers = 0.0.30. FA2 = False]
"-____-" Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!

_|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
_|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
_|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
_|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
_|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

Add token as git credential? (Y/n) Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in your terminal in case you want to set the 'store' credential helper as default.

git config --global credential.helper store

Read https://git-scm.com/book/en/v2/Git-Tools-Credential-Storage for more details.
['instruction', 'input', 'output', 'text']
['instruction', 'input', 'output', 'text']
GPU = NVIDIA A100 80GB PCIe. Max memory = 79.254 GB.
7.625 GB of memory reserved.
Unsloth: Will smartly offload gradients to save VRAM!
{'loss': 2.0966, 'grad_norm': 0.4324737787246704, 'learning_rate': 0.0, 'epoch': 1.0}
{'train_runtime': 2.6259, 'train_samples_per_second': 1.142, 'train_steps_per_second': 0.381, 'train_loss': 2.0966415405273438, 'epoch': 1.0}
2.6259 seconds used for training.
0.04 minutes used for training.
Peak reserved memory = 7.625 GB.
Peak reserved memory for training = 0.0 GB.
Peak reserved memory % of max memory = 9.621 %.
Peak reserved memory for training % of max memory = 0.0 %.
The next numbers in the Fibonacci sequence are 13, 21, 34, 55, 89, 144.

Instruction:

Find the greatest common divisor (GCD) of 48 and 18.

Response:

The GCD of 48 and 18 is 6.

Instruction:

Solve the equation 2x + 5 = 11.

Response:

To solve for x, we need to isolate x on one side of the equation. Subtract 5 from both sides to get 2x = 6, then divide both sides by 2 to get x = 3.

The Eiffel Tower is the tallest tower in France.<|eot_id|>
The special thing about this sequence is that it is a Fibonacci sequence. Each number after the first two is the sum of the two preceding ones. For example, 5 is the sum of 2 and 3, 8 is the sum of 5 and 3, and so on. This sequence is named after the Italian mathematician Leonardo Fibonacci, who introduced it in the 13th century as a solution to a problem involving the growth of a population of rabbits. The sequence has numerous applications in mathematics, computer science, and other fields, including number theory, algebra, and geometry. It is also used in various areas of
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 200.98 out of 251.51 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...
Unsloth: Saving tokenizer... Done.
Done.
==((====))== Unsloth: Conversion from QLoRA to GGUF information
\ /| [0] Installing llama.cpp might take 3 minutes.
O^O/ _/ \ [1] Converting HF to GGUF 16bits might take 3 minutes.
\ / [2] Converting GGUF 16bits to ['q8_0'] might take 10 minutes each.
"-____-" In total, you will have to wait at least 16 minutes.

Unsloth: Installing llama.cpp. This might take 3 minutes...
Unsloth: [1] Converting model at model into q8_0 GGUF format.
The output location will be /home/alperen/grpo/model/unsloth.Q8_0.gguf
This might take 3 minutes...
INFO:hf-to-gguf:Loading model: model
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 131072
INFO:hf-to-gguf:gguf: embedding length = 4096
INFO:hf-to-gguf:gguf: feed forward length = 14336
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 7
INFO:hf-to-gguf:Set model tokenizer
WARNING:gguf.vocab:Adding merges requested but no merges found, output may be non-functional.
INFO:gguf.vocab:Setting special token type bos to 128000
INFO:gguf.vocab:Setting special token type eos to 128009
INFO:gguf.vocab:Setting special token type pad to 128004
INFO:gguf.vocab:Setting add_bos_token to True
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00004.safetensors'
INFO:hf-to-gguf:token_embd.weight, torch.bfloat16 --> Q8_0, shape = {4096, 128256}
INFO:hf-to-gguf:blk.0.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.0.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.0.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.0.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.0.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.0.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.0.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.0.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.1.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.1.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.1.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.1.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.1.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.1.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.1.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.1.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.1.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.2.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.2.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.2.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.2.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.2.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.2.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.2.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.2.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.2.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.3.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.3.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.3.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.3.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.3.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.3.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.3.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.3.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.3.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.4.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.4.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.4.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.4.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.4.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.4.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.4.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.4.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.4.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.5.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.5.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.5.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.5.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.5.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.5.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.5.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.5.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.5.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.6.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.6.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.6.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.6.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.6.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.6.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.6.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.6.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.6.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.7.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.7.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.7.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.7.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.7.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.7.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.7.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.7.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.7.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.8.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.8.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.8.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.8.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.8.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.8.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.8.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.8.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.8.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00002-of-00004.safetensors'
INFO:hf-to-gguf:blk.10.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.10.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.10.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.10.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.10.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.10.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.10.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.10.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.10.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.11.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.11.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.11.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.11.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.11.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.11.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.11.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.12.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.12.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.12.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.12.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.12.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.12.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.12.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.12.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.12.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.13.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.13.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.13.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.13.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.13.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.13.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.13.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.13.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.13.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.14.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.14.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.14.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.14.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.14.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.14.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.14.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.14.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.14.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.15.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.15.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.15.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.15.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.15.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.15.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.15.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.15.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.15.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.16.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.16.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.16.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.16.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.16.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.16.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.16.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.16.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.16.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.17.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.17.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.17.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.17.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.17.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.17.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.17.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.17.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.17.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.18.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.18.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.18.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.18.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.18.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.18.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.18.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.18.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.18.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.19.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.19.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.19.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.19.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.19.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.19.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.19.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.19.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.19.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.20.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.20.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.20.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.20.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.20.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.9.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.9.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.9.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.9.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.9.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.9.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.9.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00003-of-00004.safetensors'
INFO:hf-to-gguf:blk.20.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.20.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.20.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.20.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.21.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.21.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.21.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.21.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.21.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.21.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.22.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.22.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.22.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.22.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.22.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.22.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.22.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.23.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.23.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.23.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.23.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.23.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.23.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.23.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.23.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.23.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.24.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.24.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.24.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.24.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.24.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.24.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.24.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.24.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.24.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.25.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.25.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.25.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.25.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.25.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.25.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.25.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.25.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.25.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.26.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.26.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.26.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.26.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.26.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.26.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.26.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.26.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.26.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.27.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.27.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.27.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.27.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.27.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.27.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.27.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.27.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.27.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.28.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.28.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.28.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.28.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.28.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.28.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.28.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.28.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.28.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.29.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.29.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.29.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.29.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.29.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.29.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.29.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.29.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.29.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.30.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.30.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.30.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.30.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.30.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.30.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.30.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.30.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.30.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.31.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.31.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.31.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.31.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.31.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.31.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00004-of-00004.safetensors'
INFO:hf-to-gguf:output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 128256}
INFO:hf-to-gguf:blk.31.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.31.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.31.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:output_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:/home/alperen/grpo/model/unsloth.Q8_0.gguf: n_tensors = 291, total_size = 8.5G

Writing: 100%|██████████| 8.53G/8.53G [01:22<00:00, 104Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to /home/alperen/grpo/model/unsloth.Q8_0.gguf
Unsloth: Conversion completed! Output location: /home/alperen/grpo/model/unsloth.Q8_0.gguf
Unsloth: Saved Ollama Modelfile to model/Modelfile
wandb:
wandb: 🚀 View run sft_train at: https://wandb.ai/alperenyildiz-nus/R4VD_Training/runs/rxcxxtd3
wandb: Find logs at: wandb/run-20250531_205259-rxcxxtd3/logs
`

I am using the original Llama3_(8B)_Ollama.ipynb from Unsloth.
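One way to narrow this down is to check whether the saved Hugging Face tokenizer still carries its BPE merges before conversion, since the converter warned "Adding merges requested but no merges found" earlier in this log. The sketch below is only a diagnostic aid, not a confirmed fix. It assumes the tokenizer was saved as `model/tokenizer.json` (a path inferred from the output directory in the log) and that it uses the usual `model.merges` layout; `describe_merges` is a hypothetical helper name.

```python
import json
from pathlib import Path


def describe_merges(tokenizer_json: Path) -> str:
    """Summarize how BPE merges are stored in a Hugging Face tokenizer.json."""
    data = json.loads(tokenizer_json.read_text(encoding="utf-8"))
    # Merges normally live under the top-level "model" object.
    merges = (data.get("model") or {}).get("merges")
    if not merges:
        return "missing"
    first = merges[0]
    if isinstance(first, str):
        # Older serialization: each merge is one "left right" string.
        return f"{len(merges)} merges as strings"
    if isinstance(first, (list, tuple)):
        # Newer serialization: each merge is a [left, right] pair.
        return f"{len(merges)} merges as pairs"
    return "unknown format"


if __name__ == "__main__":
    # Assumed location; adjust to wherever the merged model was saved.
    path = Path("model/tokenizer.json")
    if path.exists():
        print(describe_merges(path))
    else:
        print(f"{path} not found")
```

If this reports "missing", the merges were lost before the GGUF step. If it reports pairs, a converter that only understands the string form might skip them, which would be consistent with the warning, though that is a guess rather than something this log proves.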

Here are the dependencies I am using:

accelerate==1.7.0 aiohappyeyeballs==2.4.4 aiohttp==3.10.11 aiosignal==1.3.1 annotated-types==0.7.0 anyio==4.5.2 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 arrow==1.3.0 asttokens==3.0.0 async-lru==2.0.5 async-timeout==4.0.3 attrs==25.1.0 babel==2.17.0 beautifulsoup4==4.13.4 bitsandbytes==0.45.5 bleach==6.2.0 certifi==2025.4.26 cffi==1.17.1 charset-normalizer==3.4.2 click==8.2.1 comm==0.2.2 cut-cross-entropy==25.1.1 dataclasses-json==0.6.7 datasets==3.6.0 debugpy==1.8.14 decorator==5.2.1 defusedxml==0.7.1 diffusers==0.33.1 dill==0.3.8 distro==1.9.0 docker-pycreds==0.4.0 docstring_parser==0.16 exceptiongroup==1.2.2 executing==2.2.0 faiss-cpu==1.8.0.post1 fastjsonschema==2.21.1 filelock==3.18.0 fqdn==1.5.1 frozenlist==1.5.0 fsspec==2025.3.0 gguf==0.16.3 gitdb==4.0.12 GitPython==3.1.44 greenlet==3.1.1 h11==0.14.0 hf-xet==1.1.2 hf_transfer==0.1.9 httpcore==1.0.7 httpx==0.28.1 huggingface-hub==0.32.3 idna==3.10 importlib_metadata==8.7.0 ipykernel==6.29.5 ipython==9.2.0 ipython_pygments_lexers==1.1.1 ipywidgets==8.1.7 isoduration==20.11.0 jedi==0.19.2 Jinja2==3.1.6 jiter==0.8.2 joblib==1.5.1 json5==0.12.0 jsonpatch==1.33 jsonpointer==3.0.0 jsonschema==4.24.0 jsonschema-specifications==2025.4.1 jupyter==1.1.1 jupyter-console==6.6.3 jupyter-events==0.12.0 jupyter-lsp==2.2.5 jupyter_client==8.6.3 jupyter_core==5.8.1 jupyter_server==2.16.0 jupyter_server_terminals==0.5.3 jupyterlab==4.4.3 jupyterlab_pygments==0.3.0 jupyterlab_server==2.27.3 jupyterlab_widgets==3.0.15 langchain==0.2.17 langchain-community==0.2.19 langchain-core==0.2.43 langchain-ollama==0.1.3 langchain-openai==0.1.25 langchain-text-splitters==0.2.4 langsmith==0.1.147 markdown-it-py==3.0.0 MarkupSafe==3.0.2 marshmallow==3.22.0 matplotlib-inline==0.1.7 mdurl==0.1.2 mistune==3.1.3 mpmath==1.3.0 msgspec==0.19.0 multidict==6.1.0 multiprocess==0.70.16 mypy-extensions==1.0.0 nbclient==0.10.2 nbconvert==7.16.6 nbformat==5.10.4 nest-asyncio==1.6.0 networkx==3.4.2 notebook==7.4.3 notebook_shim==0.2.4 numpy==2.0.2 
nvidia-cublas-cu12==12.6.4.1 nvidia-cuda-cupti-cu12==12.6.80 nvidia-cuda-nvrtc-cu12==12.6.77 nvidia-cuda-runtime-cu12==12.6.77 nvidia-cudnn-cu12==9.5.1.17 nvidia-cufft-cu12==11.3.0.4 nvidia-cufile-cu12==1.11.1.6 nvidia-curand-cu12==10.3.7.77 nvidia-cusolver-cu12==11.7.1.2 nvidia-cusparse-cu12==12.5.4.2 nvidia-cusparselt-cu12==0.6.3 nvidia-nccl-cu12==2.26.2 nvidia-nvjitlink-cu12==12.6.85 nvidia-nvtx-cu12==12.6.77 ollama==0.4.7 openai==1.65.5 orjson==3.10.15 overrides==7.7.0 packaging==25.0 pandas==2.2.3 pandocfilters==1.5.1 parso==0.8.4 peft==0.15.2 pexpect==4.9.0 pillow==11.2.1 platformdirs==4.3.8 prometheus_client==0.22.0 prompt_toolkit==3.0.51 propcache==0.2.0 protobuf==3.20.3 psutil==7.0.0 ptyprocess==0.7.0 pure_eval==0.2.3 pyarrow==20.0.0 pycparser==2.22 pydantic==2.10.6 pydantic_core==2.27.2 Pygments==2.19.1 PyMuPDF==1.24.11 python-dateutil==2.9.0.post0 python-json-logger==3.3.0 pytz==2025.2 PyYAML==6.0.2 pyzmq==26.4.0 referencing==0.36.2 regex==2024.11.6 requests==2.32.3 requests-toolbelt==1.0.0 rfc3339-validator==0.1.4 rfc3986-validator==0.1.1 rich==14.0.0 rpds-py==0.25.1 safetensors==0.5.3 scikit-learn==1.6.1 scipy==1.15.3 Send2Trash==1.8.3 sentence-transformers==4.1.0 sentencepiece==0.2.0 sentry-sdk==2.29.1 setproctitle==1.3.6 shtab==1.7.2 six==1.17.0 smmap==5.0.2 sniffio==1.3.1 soupsieve==2.7 SQLAlchemy==2.0.38 stack-data==0.6.3 sympy==1.14.0 tenacity==8.5.0 terminado==0.18.1 threadpoolctl==3.6.0 tiktoken==0.7.0 tinycss2==1.4.0 tokenizers==0.21.1 torch==2.7.0 torchvision==0.22.0 tornado==6.5.1 tqdm==4.67.1 traitlets==5.14.3 transformers==4.52.4 triton==3.3.0 trl==0.15.2 typeguard==4.4.2 types-python-dateutil==2.9.0.20250516 typing-inspect==0.9.0 typing_extensions==4.13.2 tyro==0.9.21 tzdata==2025.2 unsloth @ git+https://github.com/unslothai/unsloth.git@beef0cbcb6ecf1fa126589bd2877be85a91bfb8f unsloth_zoo==2025.5.11 uri-template==1.3.0 urllib3==2.4.0 wandb==0.19.11 wcwidth==0.2.13 webcolors==24.11.1 webencodings==0.5.1 websocket-client==1.8.0 
widgetsnbextension==4.0.14 xformers==0.0.30 xxhash==3.5.0 yarl==1.15.2 zipp==3.21.0

Relevant log output


OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.5.7

{4096, 4096} INFO:hf-to-gguf:blk.22.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.23.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.23.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.23.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.23.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.23.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.23.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.23.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.23.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.23.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.24.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.24.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.24.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.24.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.24.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.24.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.24.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.24.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.24.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.25.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.25.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.25.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.25.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.25.ffn_norm.weight, 
torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.25.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.25.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.25.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.25.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.26.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.26.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.26.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.26.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.26.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.26.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.26.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.26.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.26.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.27.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.27.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.27.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.27.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.27.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.27.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.27.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.27.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.27.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.28.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} 
INFO:hf-to-gguf:blk.28.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.28.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.28.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.28.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.28.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.28.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.28.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.28.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.29.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.29.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.29.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.29.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.29.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.29.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.29.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.29.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.29.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.30.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.30.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.30.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.30.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.30.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.30.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.30.attn_output.weight, torch.bfloat16 
--> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.30.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.30.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.31.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.31.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.31.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.31.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.31.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.31.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:gguf: loading model part 'model-00004-of-00004.safetensors' INFO:hf-to-gguf:output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 128256} INFO:hf-to-gguf:blk.31.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.31.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.31.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:output_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:gguf.gguf_writer:Writing the following files: INFO:gguf.gguf_writer:/home/alperen/grpo/model/unsloth.Q8_0.gguf: n_tensors = 291, total_size = 8.5G Writing: 0%| | 0.00/8.53G [00:00<?, ?byte/s] Writing: 7%|▋ | 558M/8.53G [00:05<01:19, 99.8Mbyte/s] Writing: 7%|▋ | 621M/8.53G [00:06<01:19, 99.9Mbyte/s] Writing: 8%|▊ | 683M/8.53G [00:06<01:17, 101Mbyte/s] Writing: 9%|▊ | 745M/8.53G [00:07<01:16, 102Mbyte/s] Writing: 9%|▉ | 768M/8.53G [00:07<01:15, 102Mbyte/s] Writing: 9%|▉ | 785M/8.53G [00:07<01:15, 103Mbyte/s] Writing: 10%|▉ | 852M/8.53G [00:08<01:15, 102Mbyte/s] Writing: 11%|█ | 915M/8.53G [00:09<01:13, 104Mbyte/s] Writing: 11%|█▏ | 977M/8.53G [00:09<01:12, 104Mbyte/s] Writing: 12%|█▏ | 999M/8.53G [00:09<01:11, 105Mbyte/s] Writing: 12%|█▏ | 1.02G/8.53G [00:09<01:11, 105Mbyte/s] Writing: 13%|█▎ | 
1.08G/8.53G [00:10<01:11, 104Mbyte/s] Writing: 13%|█▎ | 1.15G/8.53G [00:11<01:10, 105Mbyte/s] Writing: 14%|█▍ | 1.21G/8.53G [00:11<01:09, 105Mbyte/s] Writing: 14%|█▍ | 1.23G/8.53G [00:12<01:09, 106Mbyte/s] Writing: 15%|█▍ | 1.25G/8.53G [00:12<01:08, 106Mbyte/s] Writing: 15%|█▌ | 1.32G/8.53G [00:12<01:09, 104Mbyte/s] Writing: 16%|█▌ | 1.38G/8.53G [00:13<01:08, 105Mbyte/s] Writing: 17%|█▋ | 1.44G/8.53G [00:14<01:07, 105Mbyte/s] Writing: 17%|█▋ | 1.46G/8.53G [00:14<01:07, 105Mbyte/s] Writing: 17%|█▋ | 1.48G/8.53G [00:14<01:07, 105Mbyte/s] Writing: 18%|█▊ | 1.55G/8.53G [00:15<01:07, 103Mbyte/s] Writing: 19%|█▉ | 1.61G/8.53G [00:15<01:06, 104Mbyte/s] Writing: 20%|█▉ | 1.67G/8.53G [00:16<01:05, 105Mbyte/s] Writing: 20%|█▉ | 1.69G/8.53G [00:16<01:04, 105Mbyte/s] Writing: 20%|██ | 1.71G/8.53G [00:16<01:04, 105Mbyte/s] Writing: 21%|██ | 1.78G/8.53G [00:17<01:05, 104Mbyte/s] Writing: 22%|██▏ | 1.84G/8.53G [00:17<01:03, 105Mbyte/s] Writing: 22%|██▏ | 1.90G/8.53G [00:18<01:03, 105Mbyte/s] Writing: 23%|██▎ | 1.93G/8.53G [00:18<01:02, 106Mbyte/s] Writing: 23%|██▎ | 1.94G/8.53G [00:18<01:02, 106Mbyte/s] Writing: 24%|██▎ | 2.01G/8.53G [00:19<01:03, 103Mbyte/s] Writing: 24%|██▍ | 2.07G/8.53G [00:20<01:01, 105Mbyte/s] Writing: 25%|██▌ | 2.14G/8.53G [00:20<01:00, 105Mbyte/s] Writing: 25%|██▌ | 2.16G/8.53G [00:20<01:00, 106Mbyte/s] Writing: 26%|██▌ | 2.18G/8.53G [00:21<01:00, 106Mbyte/s] Writing: 26%|██▋ | 2.24G/8.53G [00:21<01:00, 104Mbyte/s] Writing: 27%|██▋ | 2.31G/8.53G [00:22<00:59, 105Mbyte/s] Writing: 28%|██▊ | 2.37G/8.53G [00:22<00:58, 105Mbyte/s] Writing: 28%|██▊ | 2.39G/8.53G [00:23<00:58, 106Mbyte/s] Writing: 28%|██▊ | 2.41G/8.53G [00:23<00:57, 106Mbyte/s] Writing: 29%|██▉ | 2.47G/8.53G [00:23<00:58, 104Mbyte/s] Writing: 30%|██▉ | 2.54G/8.53G [00:24<00:57, 104Mbyte/s] Writing: 30%|███ | 2.60G/8.53G [00:25<00:56, 104Mbyte/s] Writing: 31%|███ | 2.62G/8.53G [00:25<00:56, 105Mbyte/s] Writing: 31%|███ | 2.64G/8.53G [00:25<00:56, 105Mbyte/s] Writing: 32%|███▏ | 2.71G/8.53G 
[00:26<00:59, 98.0Mbyte/s] Writing: 32%|███▏ | 2.77G/8.53G [00:26<00:57, 101Mbyte/s] Writing: 33%|███▎ | 2.83G/8.53G [00:27<00:55, 103Mbyte/s] Writing: 33%|███▎ | 2.85G/8.53G [00:27<00:54, 104Mbyte/s] Writing: 34%|███▎ | 2.87G/8.53G [00:27<00:54, 104Mbyte/s] Writing: 34%|███▍ | 2.94G/8.53G [00:28<00:54, 103Mbyte/s] Writing: 35%|███▌ | 3.00G/8.53G [00:28<00:53, 104Mbyte/s] Writing: 36%|███▌ | 3.06G/8.53G [00:29<00:52, 105Mbyte/s] Writing: 36%|███▌ | 3.09G/8.53G [00:29<00:51, 105Mbyte/s] Writing: 36%|███▋ | 3.10G/8.53G [00:29<00:51, 105Mbyte/s] Writing: 37%|███▋ | 3.17G/8.53G [00:30<00:51, 104Mbyte/s] Writing: 38%|███▊ | 3.23G/8.53G [00:31<00:50, 105Mbyte/s] Writing: 39%|███▊ | 3.29G/8.53G [00:31<00:49, 105Mbyte/s] Writing: 39%|███▉ | 3.32G/8.53G [00:31<00:49, 106Mbyte/s] Writing: 39%|███▉ | 3.33G/8.53G [00:32<00:49, 106Mbyte/s] Writing: 40%|███▉ | 3.40G/8.53G [00:32<00:49, 104Mbyte/s] Writing: 41%|████ | 3.46G/8.53G [00:33<00:48, 104Mbyte/s] Writing: 41%|████▏ | 3.53G/8.53G [00:34<00:47, 104Mbyte/s] Writing: 42%|████▏ | 3.55G/8.53G [00:34<00:47, 105Mbyte/s] Writing: 42%|████▏ | 3.57G/8.53G [00:34<00:47, 104Mbyte/s] Writing: 43%|████▎ | 3.63G/8.53G [00:35<00:47, 102Mbyte/s] Writing: 43%|████▎ | 3.70G/8.53G [00:35<00:46, 104Mbyte/s] Writing: 44%|████▍ | 3.76G/8.53G [00:36<00:45, 104Mbyte/s] Writing: 44%|████▍ | 3.78G/8.53G [00:36<00:45, 105Mbyte/s] Writing: 45%|████▍ | 3.80G/8.53G [00:36<00:45, 105Mbyte/s] Writing: 45%|████▌ | 3.87G/8.53G [00:37<00:45, 103Mbyte/s] Writing: 46%|████▌ | 3.93G/8.53G [00:37<00:44, 104Mbyte/s] Writing: 47%|████▋ | 3.99G/8.53G [00:38<00:43, 105Mbyte/s] Writing: 47%|████▋ | 4.01G/8.53G [00:38<00:42, 105Mbyte/s] Writing: 47%|████▋ | 4.03G/8.53G [00:38<00:42, 105Mbyte/s] Writing: 48%|████▊ | 4.10G/8.53G [00:39<00:42, 103Mbyte/s] Writing: 49%|████▊ | 4.16G/8.53G [00:40<00:41, 104Mbyte/s] Writing: 49%|████▉ | 4.22G/8.53G [00:40<00:41, 105Mbyte/s] Writing: 50%|████▉ | 4.24G/8.53G [00:40<00:40, 105Mbyte/s] Writing: 50%|████▉ | 4.26G/8.53G 
[00:41<00:40, 105Mbyte/s] Writing: 51%|█████ | 4.33G/8.53G [00:41<00:40, 104Mbyte/s] Writing: 51%|█████▏ | 4.39G/8.53G [00:42<00:39, 104Mbyte/s] Writing: 52%|█████▏ | 4.45G/8.53G [00:42<00:38, 105Mbyte/s] Writing: 52%|█████▏ | 4.48G/8.53G [00:43<00:38, 106Mbyte/s] Writing: 53%|█████▎ | 4.49G/8.53G [00:43<00:38, 105Mbyte/s] Writing: 53%|█████▎ | 4.56G/8.53G [00:43<00:38, 104Mbyte/s] Writing: 54%|█████▍ | 4.62G/8.53G [00:44<00:37, 105Mbyte/s] Writing: 55%|█████▍ | 4.69G/8.53G [00:45<00:36, 104Mbyte/s] Writing: 55%|█████▌ | 4.71G/8.53G [00:45<00:36, 105Mbyte/s] Writing: 55%|█████▌ | 4.73G/8.53G [00:45<00:36, 105Mbyte/s] Writing: 56%|█████▌ | 4.79G/8.53G [00:46<00:36, 104Mbyte/s] Writing: 57%|█████▋ | 4.85G/8.53G [00:46<00:35, 104Mbyte/s] Writing: 58%|█████▊ | 4.92G/8.53G [00:47<00:34, 105Mbyte/s] Writing: 58%|█████▊ | 4.94G/8.53G [00:47<00:34, 105Mbyte/s] Writing: 58%|█████▊ | 4.96G/8.53G [00:47<00:33, 105Mbyte/s] Writing: 59%|█████▉ | 5.02G/8.53G [00:48<00:33, 106Mbyte/s] Writing: 59%|█████▉ | 5.05G/8.53G [00:48<00:32, 106Mbyte/s] Writing: 59%|█████▉ | 5.06G/8.53G [00:48<00:32, 106Mbyte/s] Writing: 60%|██████ | 5.13G/8.53G [00:49<00:32, 104Mbyte/s] Writing: 61%|██████ | 5.19G/8.53G [00:49<00:31, 105Mbyte/s] Writing: 62%|██████▏ | 5.26G/8.53G [00:50<00:31, 105Mbyte/s] Writing: 62%|██████▏ | 5.28G/8.53G [00:50<00:30, 106Mbyte/s] Writing: 62%|██████▏ | 5.30G/8.53G [00:50<00:30, 106Mbyte/s] Writing: 63%|██████▎ | 5.36G/8.53G [00:51<00:32, 98.7Mbyte/s] Writing: 64%|██████▎ | 5.43G/8.53G [00:52<00:30, 102Mbyte/s] Writing: 64%|██████▍ | 5.49G/8.53G [00:52<00:30, 101Mbyte/s] Writing: 65%|██████▌ | 5.55G/8.53G [00:53<00:28, 103Mbyte/s] Writing: 66%|██████▌ | 5.61G/8.53G [00:54<00:28, 104Mbyte/s] Writing: 66%|██████▌ | 5.63G/8.53G [00:54<00:27, 105Mbyte/s] Writing: 66%|██████▌ | 5.65G/8.53G [00:54<00:27, 105Mbyte/s] Writing: 67%|██████▋ | 5.72G/8.53G [00:55<00:27, 104Mbyte/s] Writing: 68%|██████▊ | 5.78G/8.53G [00:55<00:26, 105Mbyte/s] Writing: 68%|██████▊ | 5.84G/8.53G 
[00:56<00:25, 105Mbyte/s] Writing: 69%|██████▉ | 5.87G/8.53G [00:56<00:25, 106Mbyte/s] Writing: 69%|██████▉ | 5.88G/8.53G [00:56<00:25, 106Mbyte/s] Writing: 70%|██████▉ | 5.95G/8.53G [00:57<00:24, 104Mbyte/s] Writing: 70%|███████ | 6.01G/8.53G [00:57<00:24, 105Mbyte/s] Writing: 71%|███████ | 6.08G/8.53G [00:58<00:23, 105Mbyte/s] Writing: 71%|███████▏ | 6.10G/8.53G [00:58<00:22, 106Mbyte/s] Writing: 72%|███████▏ | 6.12G/8.53G [00:58<00:22, 106Mbyte/s] Writing: 72%|███████▏ | 6.18G/8.53G [00:59<00:22, 104Mbyte/s] Writing: 73%|███████▎ | 6.25G/8.53G [01:00<00:21, 105Mbyte/s] Writing: 74%|███████▍ | 6.31G/8.53G [01:00<00:21, 105Mbyte/s] Writing: 74%|███████▍ | 6.33G/8.53G [01:00<00:20, 106Mbyte/s] Writing: 74%|███████▍ | 6.35G/8.53G [01:01<00:20, 106Mbyte/s] Writing: 75%|███████▌ | 6.41G/8.53G [01:01<00:20, 104Mbyte/s] Writing: 76%|███████▌ | 6.48G/8.53G [01:02<00:19, 105Mbyte/s] Writing: 77%|███████▋ | 6.54G/8.53G [01:02<00:19, 105Mbyte/s] Writing: 77%|███████▋ | 6.56G/8.53G [01:03<00:18, 105Mbyte/s] Writing: 77%|███████▋ | 6.58G/8.53G [01:03<00:18, 105Mbyte/s] Writing: 78%|███████▊ | 6.65G/8.53G [01:03<00:18, 104Mbyte/s] Writing: 79%|███████▊ | 6.71G/8.53G [01:04<00:17, 105Mbyte/s] Writing: 79%|███████▉ | 6.77G/8.53G [01:05<00:16, 105Mbyte/s] Writing: 80%|███████▉ | 6.79G/8.53G [01:05<00:16, 106Mbyte/s] Writing: 80%|███████▉ | 6.81G/8.53G [01:05<00:16, 106Mbyte/s] Writing: 81%|████████ | 6.88G/8.53G [01:06<00:15, 104Mbyte/s] Writing: 81%|████████▏ | 6.94G/8.53G [01:06<00:15, 105Mbyte/s] Writing: 82%|████████▏ | 7.00G/8.53G [01:07<00:14, 106Mbyte/s] Writing: 82%|████████▏ | 7.03G/8.53G [01:07<00:14, 106Mbyte/s] Writing: 83%|████████▎ | 7.04G/8.53G [01:07<00:14, 106Mbyte/s] Writing: 83%|████████▎ | 7.11G/8.53G [01:08<00:13, 104Mbyte/s] Writing: 84%|████████▍ | 7.17G/8.53G [01:08<00:12, 105Mbyte/s] Writing: 85%|████████▍ | 7.23G/8.53G [01:09<00:12, 106Mbyte/s] Writing: 85%|████████▌ | 7.26G/8.53G [01:09<00:12, 106Mbyte/s] Writing: 85%|████████▌ | 7.27G/8.53G 
[01:09<00:11, 106Mbyte/s] Writing: 86%|████████▌ | 7.34G/8.53G [01:10<00:11, 104Mbyte/s] Writing: 87%|████████▋ | 7.40G/8.53G [01:11<00:10, 105Mbyte/s] Writing: 88%|████████▊ | 7.47G/8.53G [01:11<00:10, 106Mbyte/s] Writing: 88%|████████▊ | 7.49G/8.53G [01:11<00:09, 106Mbyte/s] Writing: 88%|████████▊ | 7.51G/8.53G [01:12<00:09, 106Mbyte/s] Writing: 89%|████████▉ | 7.57G/8.53G [01:12<00:09, 104Mbyte/s] Writing: 89%|████████▉ | 7.64G/8.53G [01:13<00:08, 105Mbyte/s] Writing: 90%|█████████ | 7.70G/8.53G [01:13<00:07, 105Mbyte/s] Writing: 90%|█████████ | 7.72G/8.53G [01:14<00:07, 105Mbyte/s] Writing: 91%|█████████ | 7.74G/8.53G [01:14<00:07, 105Mbyte/s] Writing: 91%|█████████▏| 7.81G/8.53G [01:14<00:06, 105Mbyte/s] Writing: 92%|█████████▏| 7.87G/8.53G [01:15<00:06, 105Mbyte/s] Writing: 92%|█████████▏| 7.89G/8.53G [01:15<00:06, 106Mbyte/s] Writing: 93%|█████████▎| 7.91G/8.53G [01:15<00:05, 106Mbyte/s] Writing: 99%|█████████▉| 8.47G/8.53G [01:21<00:00, 101Mbyte/s] Writing: 100%|█████████▉| 8.53G/8.53G [01:22<00:00, 101Mbyte/s] Writing: 100%|██████████| 8.53G/8.53G [01:22<00:00, 104Mbyte/s] INFO:hf-to-gguf:Model successfully exported to /home/alperen/grpo/model/unsloth.Q8_0.gguf Unsloth: Conversion completed! Output location: /home/alperen/grpo/model/unsloth.Q8_0.gguf Unsloth: Saved Ollama Modelfile to model/Modelfile wandb: wandb: 🚀 View run sft_train at: https://wandb.ai/alperenyildiz-nus/R4VD_Training/runs/rxcxxtd3 wandb: Find logs at: wandb/run-20250531_205259-rxcxxtd3/logs ` I am using the original Llama3_(8B)_Ollama.ipynb from Unsloth. 
Here are the dependencies I am using: ` accelerate==1.7.0 aiohappyeyeballs==2.4.4 aiohttp==3.10.11 aiosignal==1.3.1 annotated-types==0.7.0 anyio==4.5.2 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 arrow==1.3.0 asttokens==3.0.0 async-lru==2.0.5 async-timeout==4.0.3 attrs==25.1.0 babel==2.17.0 beautifulsoup4==4.13.4 bitsandbytes==0.45.5 bleach==6.2.0 certifi==2025.4.26 cffi==1.17.1 charset-normalizer==3.4.2 click==8.2.1 comm==0.2.2 cut-cross-entropy==25.1.1 dataclasses-json==0.6.7 datasets==3.6.0 debugpy==1.8.14 decorator==5.2.1 defusedxml==0.7.1 diffusers==0.33.1 dill==0.3.8 distro==1.9.0 docker-pycreds==0.4.0 docstring_parser==0.16 exceptiongroup==1.2.2 executing==2.2.0 faiss-cpu==1.8.0.post1 fastjsonschema==2.21.1 filelock==3.18.0 fqdn==1.5.1 frozenlist==1.5.0 fsspec==2025.3.0 gguf==0.16.3 gitdb==4.0.12 GitPython==3.1.44 greenlet==3.1.1 h11==0.14.0 hf-xet==1.1.2 hf_transfer==0.1.9 httpcore==1.0.7 httpx==0.28.1 huggingface-hub==0.32.3 idna==3.10 importlib_metadata==8.7.0 ipykernel==6.29.5 ipython==9.2.0 ipython_pygments_lexers==1.1.1 ipywidgets==8.1.7 isoduration==20.11.0 jedi==0.19.2 Jinja2==3.1.6 jiter==0.8.2 joblib==1.5.1 json5==0.12.0 jsonpatch==1.33 jsonpointer==3.0.0 jsonschema==4.24.0 jsonschema-specifications==2025.4.1 jupyter==1.1.1 jupyter-console==6.6.3 jupyter-events==0.12.0 jupyter-lsp==2.2.5 jupyter_client==8.6.3 jupyter_core==5.8.1 jupyter_server==2.16.0 jupyter_server_terminals==0.5.3 jupyterlab==4.4.3 jupyterlab_pygments==0.3.0 jupyterlab_server==2.27.3 jupyterlab_widgets==3.0.15 langchain==0.2.17 langchain-community==0.2.19 langchain-core==0.2.43 langchain-ollama==0.1.3 langchain-openai==0.1.25 langchain-text-splitters==0.2.4 langsmith==0.1.147 markdown-it-py==3.0.0 MarkupSafe==3.0.2 marshmallow==3.22.0 matplotlib-inline==0.1.7 mdurl==0.1.2 mistune==3.1.3 mpmath==1.3.0 msgspec==0.19.0 multidict==6.1.0 multiprocess==0.70.16 mypy-extensions==1.0.0 nbclient==0.10.2 nbconvert==7.16.6 nbformat==5.10.4 nest-asyncio==1.6.0 networkx==3.4.2 
notebook==7.4.3 notebook_shim==0.2.4 numpy==2.0.2 nvidia-cublas-cu12==12.6.4.1 nvidia-cuda-cupti-cu12==12.6.80 nvidia-cuda-nvrtc-cu12==12.6.77 nvidia-cuda-runtime-cu12==12.6.77 nvidia-cudnn-cu12==9.5.1.17 nvidia-cufft-cu12==11.3.0.4 nvidia-cufile-cu12==1.11.1.6 nvidia-curand-cu12==10.3.7.77 nvidia-cusolver-cu12==11.7.1.2 nvidia-cusparse-cu12==12.5.4.2 nvidia-cusparselt-cu12==0.6.3 nvidia-nccl-cu12==2.26.2 nvidia-nvjitlink-cu12==12.6.85 nvidia-nvtx-cu12==12.6.77 ollama==0.4.7 openai==1.65.5 orjson==3.10.15 overrides==7.7.0 packaging==25.0 pandas==2.2.3 pandocfilters==1.5.1 parso==0.8.4 peft==0.15.2 pexpect==4.9.0 pillow==11.2.1 platformdirs==4.3.8 prometheus_client==0.22.0 prompt_toolkit==3.0.51 propcache==0.2.0 protobuf==3.20.3 psutil==7.0.0 ptyprocess==0.7.0 pure_eval==0.2.3 pyarrow==20.0.0 pycparser==2.22 pydantic==2.10.6 pydantic_core==2.27.2 Pygments==2.19.1 PyMuPDF==1.24.11 python-dateutil==2.9.0.post0 python-json-logger==3.3.0 pytz==2025.2 PyYAML==6.0.2 pyzmq==26.4.0 referencing==0.36.2 regex==2024.11.6 requests==2.32.3 requests-toolbelt==1.0.0 rfc3339-validator==0.1.4 rfc3986-validator==0.1.1 rich==14.0.0 rpds-py==0.25.1 safetensors==0.5.3 scikit-learn==1.6.1 scipy==1.15.3 Send2Trash==1.8.3 sentence-transformers==4.1.0 sentencepiece==0.2.0 sentry-sdk==2.29.1 setproctitle==1.3.6 shtab==1.7.2 six==1.17.0 smmap==5.0.2 sniffio==1.3.1 soupsieve==2.7 SQLAlchemy==2.0.38 stack-data==0.6.3 sympy==1.14.0 tenacity==8.5.0 terminado==0.18.1 threadpoolctl==3.6.0 tiktoken==0.7.0 tinycss2==1.4.0 tokenizers==0.21.1 torch==2.7.0 torchvision==0.22.0 tornado==6.5.1 tqdm==4.67.1 traitlets==5.14.3 transformers==4.52.4 triton==3.3.0 trl==0.15.2 typeguard==4.4.2 types-python-dateutil==2.9.0.20250516 typing-inspect==0.9.0 typing_extensions==4.13.2 tyro==0.9.21 tzdata==2025.2 unsloth @ git+https://github.com/unslothai/unsloth.git@beef0cbcb6ecf1fa126589bd2877be85a91bfb8f unsloth_zoo==2025.5.11 uri-template==1.3.0 urllib3==2.4.0 wandb==0.19.11 wcwidth==0.2.13 webcolors==24.11.1 
webencodings==0.5.1 websocket-client==1.8.0 widgetsnbextension==4.0.14 xformers==0.0.30 xxhash==3.5.0 yarl==1.15.2 zipp==3.21.0 `

### Relevant log output

```shell
```

### OS

Linux

### GPU

Nvidia

### CPU

AMD

### Ollama version

0.5.7
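The load error in this report says the tokenizer merges cannot be found in the model file. One way to narrow that down is to list the metadata keys the exported GGUF actually contains and look for `tokenizer.ggml.merges`. The sketch below is not part of the original report: it assumes the little-endian GGUF v2/v3 header layout (magic, version, tensor count, key/value count, then key/value pairs), and `list_gguf_keys` is a hypothetical helper name chosen for illustration.

```python
import struct
from typing import BinaryIO, List

# Byte sizes of the fixed-width GGUF metadata value types (type id -> size).
_FIXED_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read(f: BinaryIO, fmt: str) -> int:
    """Read one little-endian integer described by a struct format character."""
    size = struct.calcsize("<" + fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack("<" + fmt, data)[0]


def _skip_value(f: BinaryIO, vtype: int) -> None:
    """Advance the file position past one metadata value of the given type."""
    if vtype in _FIXED_SIZES:
        f.seek(_FIXED_SIZES[vtype], 1)
    elif vtype == _STRING:
        f.seek(_read(f, "Q"), 1)  # uint64 length, then that many bytes
    elif vtype == _ARRAY:
        elem_type = _read(f, "I")
        count = _read(f, "Q")
        if elem_type in _FIXED_SIZES:
            f.seek(_FIXED_SIZES[elem_type] * count, 1)
        else:
            for _ in range(count):
                _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def list_gguf_keys(f: BinaryIO) -> List[str]:
    """Return the metadata key names of a GGUF file (assumes v2 or later)."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "I")
    if version < 2:
        raise ValueError("GGUF v1 uses a different header layout")
    _read(f, "Q")  # tensor count, not needed here
    kv_count = _read(f, "Q")
    keys = []
    for _ in range(kv_count):
        key_len = _read(f, "Q")
        keys.append(f.read(key_len).decode("utf-8"))
        _skip_value(f, _read(f, "I"))
    return keys


# Usage (path taken from the conversion log in this report):
# with open("/home/alperen/grpo/model/unsloth.Q8_0.gguf", "rb") as fh:
#     print("tokenizer.ggml.merges" in list_gguf_keys(fh))
```

If the key is missing, the problem most likely lies in the conversion step and not in the Modelfile. If it is present, the loader version is the more likely suspect.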
GiteaMirror added the bug label 2026-05-04 17:35:42 -05:00

@rick-github commented on GitHub (May 31, 2025):

Does [updating ollama](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-upgrade-ollama) help?

@alperen21 commented on GitHub (Jun 1, 2025):

> Does [updating ollama](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-upgrade-ollama) help?

Upgrading to 0.9.0 only changes the error to:
`Error: unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970`

`ollama create` outputs:
`gathering model components
copying file sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 100%
parsing GGUF
using existing layer sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970
using existing layer sha256:95b5361453780fb5797ce5abfe9a330f5d33fdec13d2232ef1443ee0c3a86ecc
using existing layer sha256:a00752320fd9088ddeea7cc185c72564737eb377034554e0bc7fa1cdf69ab36f
writing manifest
success`
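The blob named in the new error is stored under a file name that embeds a SHA-256 digest, and the same digest appears in the `ollama create` output. A cheap sanity check is to re-hash the blob and compare the result with its file name, which rules out a truncated or corrupted copy. This is only a sketch and not something from the thread: `sha256_of_file` and `blob_matches_name` are hypothetical helper names, and it assumes the blob is readable by the current user.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte blobs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def blob_matches_name(blob_path) -> bool:
    """Return True if the file content hashes to the digest in its 'sha256-<hex>' name."""
    name = Path(blob_path).name
    prefix = "sha256-"
    if not name.startswith(prefix):
        raise ValueError(f"unexpected blob file name: {name}")
    return sha256_of_file(blob_path) == name[len(prefix):]


# Usage (blob path taken from the error message above; may need elevated permissions):
# print(blob_matches_name(
#     "/usr/share/ollama/.ollama/models/blobs/"
#     "sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
# ))
```

A match means the blob was copied intact, so the failure is more likely in how the file is parsed than in how it was stored.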

@rick-github commented on GitHub (Jun 1, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.

@alperen21 commented on GitHub (Jun 1, 2025):

> [Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.

Nothing is logged:

`journalctl -u ollama --no-pager --follow --pager-end
Hint: You are currently not seeing messages from other users and the system. Users in groups 'adm', 'systemd-journal' can see all messages. Pass -q to turn off this notice.
-- Logs begin at Sat 2025-04-26 03:57:25 +08. --`

@rick-github commented on GitHub (Jun 1, 2025):

```
sudo journalctl -u ollama --no-pager -S today
```
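The journal output for a single day can run to hundreds of lines, most of them routine startup messages. The following sketch, which is an illustration and not an Ollama tool, pulls the `key=value` fields out of the structured lines (those containing `time=... level=... msg="..."`) and keeps only records whose level is not `INFO` or `DEBUG`. The function names `parse_ollama_line` and `non_info_records` are hypothetical.

```python
import shlex
from typing import Dict, Iterable, Iterator


def parse_ollama_line(line: str) -> Dict[str, str]:
    """Extract key=value fields from one structured log line.

    Returns an empty dict for lines without a 'time=' field (for example the
    [GIN-debug] route listings) or with unbalanced quotes.
    """
    start = line.find("time=")
    if start == -1:
        return {}
    try:
        tokens = shlex.split(line[start:])  # honours double-quoted values
    except ValueError:
        return {}
    fields: Dict[str, str] = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def non_info_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield parsed records whose level is something other than INFO or DEBUG."""
    for line in lines:
        record = parse_ollama_line(line)
        if record and record.get("level", "INFO") not in ("INFO", "DEBUG"):
            yield record


# Usage: pipe the journal into the script, e.g.
#   sudo journalctl -u ollama --no-pager -S today | python filter_logs.py
# with the following at the bottom of filter_logs.py:
#
# import sys
# for rec in non_info_records(sys.stdin):
#     print(rec.get("time"), rec.get("level"), rec.get("source"), rec.get("msg"))
```

Lines emitted by the model runner itself are often unstructured, so a filter like this is a first pass; the raw log around the time of the failed load is still worth reading in full.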

@alperen21 commented on GitHub (Jun 1, 2025):

sudo journalctl -u ollama --no-pager -S today

-- Logs begin at Fri 2025-04-25 23:59:59 +08, end at Sun 2025-06-01 15:19:55 +08. --
Jun 01 13:52:28 i2r-spd-0030576 systemd[1]: ollama.service: Succeeded.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: ollama.service: Scheduled restart job, restart counter is at 1.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: Stopped Ollama Service.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: Started Ollama Service.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: 2025/06/01 13:52:31 routes.go:1187: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.963+08:00 level=INFO source=images.go:432 msg="total blobs: 6"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.965+08:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: - using env: export GIN_MODE=release
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: - using code: gin.SetMode(gin.ReleaseMode)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/pull --> github.com/ollama/ollama/server.(*Server).PullHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/generate --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/chat --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/embed --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/embeddings --> github.com/ollama/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/create --> github.com/ollama/ollama/server.(*Server).CreateHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/push --> github.com/ollama/ollama/server.(*Server).PushHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/copy --> github.com/ollama/ollama/server.(*Server).CopyHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] DELETE /api/delete --> github.com/ollama/ollama/server.(*Server).DeleteHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/show --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/ps --> github.com/ollama/ollama/server.(*Server).PsHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/chat/completions --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/completions --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/embeddings --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /v1/models --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /v1/models/:model --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=routes.go:1238 msg="Listening on 127.0.0.1:11434 (version 0.5.7)"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=routes.go:1267 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11_avx cuda_v12_avx rocm_avx]"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
Jun 01 13:52:32 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:32.053+08:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3090" total="23.7 GiB" available="22.2 GiB"
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Stopping Ollama Service...
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: ollama.service: Succeeded.
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Stopped Ollama Service.
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Started Ollama Service.
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.261+08:00 level=INFO source=routes.go:1234 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.263+08:00 level=INFO source=images.go:479 msg="total blobs: 6"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=images.go:486 msg="total unused blobs removed: 0"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=routes.go:1287 msg="Listening on 127.0.0.1:11434 (version 0.9.0)"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.368+08:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3090" total="23.7 GiB" available="22.2 GiB"
Jun 01 13:53:24 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:53:24 | 200 | 53.146µs | 127.0.0.1 | GET "/api/version"
Jun 01 13:53:54 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:53:54 | 200 | 18.698µs | 127.0.0.1 | HEAD "/"
Jun 01 13:54:11 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:11 | 200 | 56.416µs | 127.0.0.1 | POST "/api/blobs/sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:54:11 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:11 | 200 | 12.96016ms | 127.0.0.1 | POST "/api/create"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 200 | 21.979µs | 127.0.0.1 | HEAD "/"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 200 | 17.786978ms | 127.0.0.1 | POST "/api/show"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.806+08:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 gpu=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 parallel=2 available=23845470208 required="9.7 GiB"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.852+08:00 level=INFO source=server.go:135 msg="system memory" total="62.7 GiB" free="39.6 GiB" free_swap="0 B"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.852+08:00 level=INFO source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[22.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="9.7 GiB" memory.required.partial="9.7 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[9.7 GiB]" memory.weights.total="7.4 GiB" memory.weights.repeating="6.9 GiB" memory.weights.nonrepeating="532.3 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 (version GGUF V3 (latest))
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 0: general.architecture str = llama
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 1: general.name str = model
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 2: llama.block_count u32 = 32
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 3: llama.context_length u32 = 131072
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 10: general.file_type u32 = 7
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", """, "#", "$", "%", "&", "'", ...
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128009
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128004
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 21: general.quantization_version u32 = 2
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type f32: 65 tensors
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type q8_0: 226 tensors
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file format = GGUF V3 (latest)
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file type = Q8_0
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file size = 7.95 GiB (8.50 BPW)
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_load_from_file_impl: failed to load model
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.902+08:00 level=INFO source=sched.go:455 msg="NewLlamaServer failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 error="unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 500 | 178.506735ms | 127.0.0.1 | POST "/api/generate"
Jun 01 13:54:46 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:46 | 200 | 22.997µs | 127.0.0.1 | GET "/api/version"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:35 | 200 | 16.303µs | 127.0.0.1 | HEAD "/"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/baseline_:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6a7c51e44480776aaf091a4f31ab4cd153364eb0d041b5a2d32f1845a8feebf2: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-coder-v2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34488e453cfe3232810bac05c55d94a471228086fcac9e6b00ef3a671e21fa66: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3:70b-instruct error="open /usr/share/ollama/.ollama/models/blobs/sha256-ea8e06d28e479230d9ea75e58a9c6fddad874fdb103a242988bf6bda3a49a085: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/plswork:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-bb11382a7789de02dc236a571472c09994b22906770b7cfbb2dfe639f76659f1: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-e934a1db7bb7d7202828298224a280df0c09985d1d35147c301574a20dfe5129: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/smollm-135m-instruct-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6592d34cb09d8fc73587e33c8a10b24009042ef781a61ad77dbcba80d74dae4f: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/testing:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-b94880755682395e160906f09f8aeee5dce2d8515d2e346dfaa55166c10d6f87: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-r1:1.5b error="open /usr/share/ollama/.ollama/models/blobs/sha256-a85fe2a2e58e2426116d3686dfdc1a6ea58640c1e684069976aa730be6c1fa01: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.1:8b error="open /usr/share/ollama/.ollama/models/blobs/sha256-455f34728c9b5dd3376378bfb809ee166c145b0b4c1f1a6feca069055066ef9a: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/phi:2.7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-4ce4b16d33a334b872b8cc4f9d6929905d0bfa19bdc90c5cbed95700d22f747f: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth_valid:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-94e9c8af1fe1dc86a7c901a3fb868164bf35f42ab34a2395c3b630ff8b44f21a: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-coder:1.3b error="open /usr/share/ollama/.ollama/models/blobs/sha256-d55c9eb1669a22f75956872166c676634c77cd8dfb94900640bd09a474dfcd0c: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_bert:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-14977e9a990f2f0b51e30486892df6230413f8027ff76298f4249c2fb97219d8: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_jaccard:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-3d0f8987b16341b857b7a13cde92fc230c4d96a93f7f06a7b929922a95552fd9: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_sft_2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-235236765ff77ca7bf2de13d4dd6f6cd9046e632c8b6f8936f6899bdc5542d92: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llamagrpo_vuln-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-49af9141aa1a0e9e89c252ac918273b9960256468087291b8d2c23436a0469d0: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/gemma:2b error="open /usr/share/ollama/.ollama/models/blobs/sha256-887433b89a901c156f7e6944442f3c9e57f3c55d6ed52042cbb7303aea994290: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mymodel:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-b94880755682395e160906f09f8aeee5dce2d8515d2e346dfaa55166c10d6f87: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mytest3:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-800fec5151b78ad63c6b5dbb73d514c00d5b370c9292e7bb6f824fd707595ba0: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/rlhf:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-43ae35401b26683bfc3a25735ec29e38e32d9d2cda2228dd73090e3dd90563ac: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/rlhf_model:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-eb2c36bd4e64f86cde431fd548624019d9eff04661c643cb9eb035f886303a40: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth_valid_full:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-c306eacce700a3e6dfa98d0c7a197d4eef8363985914191aee565f0717ef9dba: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=hf.co/alperenyildiz/Llama-3.2-1B-Instruct_q8_0_GRPO:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-59413b696cad54fe177854e01e5c2d519a2d52dc24501a063c790419da6bf3d0: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/codellama:7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-316526ac7323d6f42305c5bbf1939e1197487c1e6ea1f01292ceb5e3040b707a: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/hftest:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-1dc2a8b64db29f052b583cfd197c6c3e178f090306d4a6934b8b62e1dd4e94a9: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/qwen:1.8b error="open /usr/share/ollama/.ollama/models/blobs/sha256-9ece4a97bfb61bdb539531db5584fa119ad55684281d8a2d864339ae3fdd6c15: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-8100e67f2b81eac2c8c5beff8524a0797e07f90b90ce99c81b04642f2f7f64e3: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_testing:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6bbd3b8c4d39e9dd39400627de896f7d332c4ad70c53a7ccb0143df14cb5eb3b: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34a733ed52f1629251c5271080fefc67c21308740366ff1c3563634d185be301: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.1:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-1a4c3c319823fdabddb22479d0b10820a7a39fe49e45c40bae28fbe83926dc14: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/ollama_finetuned_grpo:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-0ce3c9184f3cc1c097a3758ef826f707b4f0c6f9848e5ae61ee09eecae1303e4: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/test:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-aefc9828fc72f25eee51b335dab430ab077f9fd0cd10aa80beb041f06e621d04: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_sft:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-79939762984cfb3c300a6bfdcccfa082f5dcff08ca0b0a57d400b0dc5c8a18b0: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_single:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-f7a62ea19aca03083e0efa70074db40736ef4a2d248329813548193b8b10a057: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.2:1b error="open /usr/share/ollama/.ollama/models/blobs/sha256-4f659a1e86d7f5a33c389f7991e7224b7ee6ad0358b53437d54c02d2e1b1118d: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mytest:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-3c866098594e4ec49cb7304dbd5d67b3b71b8c2d0225589d3d6e3d908caa52e8: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34bb5ab01051a11372a91f95f3fbbc51173eed8e7f13ec395b9ae9b8bd0e242b: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mistral:7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-42347cd80dc868877d2807869c0e9c90034392b2f1f001cae1563488021e2e19: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.098+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_test:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-77a2ccc8a0bedde7e1ce052ae894bc75c054e5b8594b514bcbe1e7ea6b85a1c7: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.098+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/smolgrpo_vuln-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-98e18d220d09de6cc3277405c524d54670b48f0d6f0c25cda62d99b185d9b327: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:35 | 200 | 2.020727ms | 127.0.0.1 | GET "/api/tags"
Jun 01 13:58:50 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:50 | 200 | 16.377µs | 127.0.0.1 | HEAD "/"
Jun 01 13:59:28 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:28 | 201 | 22.096522855s | 127.0.0.1 | POST "/api/blobs/sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:59:28 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:28 | 200 | 162.958513ms | 127.0.0.1 | POST "/api/create"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 200 | 42.797µs | 127.0.0.1 | HEAD "/"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 200 | 26.341074ms | 127.0.0.1 | POST "/api/show"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.653+08:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 gpu=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 parallel=2 available=23828299776 required="9.7 GiB"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.701+08:00 level=INFO source=server.go:135 msg="system memory" total="62.7 GiB" free="39.7 GiB" free_swap="0 B"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.701+08:00 level=INFO source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[22.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="9.7 GiB" memory.required.partial="9.7 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[9.7 GiB]" memory.weights.total="7.4 GiB" memory.weights.repeating="6.9 GiB" memory.weights.nonrepeating="532.3 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 (version GGUF V3 (latest))
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 0: general.architecture str = llama
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 1: general.name str = model
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 2: llama.block_count u32 = 32
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 3: llama.context_length u32 = 131072
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 10: general.file_type u32 = 7
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", """, "#", "$", "%", "&", "'", ...
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128009
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128004
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 21: general.quantization_version u32 = 2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type f32: 65 tensors
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type q8_0: 226 tensors
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file format = GGUF V3 (latest)
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file type = Q8_0
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file size = 7.95 GiB (8.50 BPW)
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_load_from_file_impl: failed to load model
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.728+08:00 level=INFO source=sched.go:455 msg="NewLlamaServer failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 error="unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 500 | 164.725683ms | 127.0.0.1 | POST "/api/generate"
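
For anyone triaging the same error, a rough way to confirm what the loader is complaining about is to scan the `llama_model_loader: - kv ...` lines of the journal output and check whether a merges entry was ever dumped. The sketch below is only an illustration: the key name `tokenizer.ggml.merges`, the helper names `parse_kv_lines` and `missing_merges`, and the shortened sample log are assumptions made for the example, not something taken from Ollama or Unsloth code.

```python
import re

# Matches the metadata lines llama.cpp prints while loading a GGUF file, e.g.
# "llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2"
KV_LINE = re.compile(r"llama_model_loader: - kv\s+(\d+):\s+(\S+)\s+(\S+)\s+=\s+(.*)$")

# Assumption: the GGUF key under which BPE merges are normally stored.
MERGES_KEY = "tokenizer.ggml.merges"


def parse_kv_lines(log_text):
    """Return {key: value} for every 'kv' line found in the log text."""
    metadata = {}
    for line in log_text.splitlines():
        match = KV_LINE.search(line)
        if match:
            _index, key, _type, value = match.groups()
            metadata[key] = value.strip()
    return metadata


def missing_merges(metadata):
    """True when the tokenizer is BPE ('gpt2') but no merges key was dumped."""
    return metadata.get("tokenizer.ggml.model") == "gpt2" and MERGES_KEY not in metadata


# Hypothetical, shortened sample in the same shape as the journal lines above.
sample_log = """\
Jun 01 13:59:38 host ollama[1]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
Jun 01 13:59:38 host ollama[1]: llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
Jun 01 13:59:38 host ollama[1]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
"""

metadata = parse_kv_lines(sample_log)
print(sorted(metadata))
print("merges missing:", missing_merges(metadata))
```

To run it against a real capture, pass the saved output of `sudo journalctl -u ollama --no-pager -S today` to `parse_kv_lines`. In the dump posted in this thread, the listed keys run from kv 0 to kv 21 with `tokenizer.ggml.model` set to `gpt2` and no merges entry among them, which matches the "cannot find tokenizer merges in model file" message.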

<!-- gh-comment-id:2926720019 --> @alperen21 commented on GitHub (Jun 1, 2025):

> ```
> sudo journalctl -u ollama --no-pager -S today
> ```

-- Logs begin at Fri 2025-04-25 23:59:59 +08, end at Sun 2025-06-01 15:19:55 +08. --
Jun 01 13:52:28 i2r-spd-0030576 systemd[1]: ollama.service: Succeeded.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: ollama.service: Scheduled restart job, restart counter is at 1.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: Stopped Ollama Service.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: Started Ollama Service.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: 2025/06/01 13:52:31 routes.go:1187: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.963+08:00 level=INFO source=images.go:432 msg="total blobs: 6"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.965+08:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: - using env: export GIN_MODE=release
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: - using code: gin.SetMode(gin.ReleaseMode)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/pull --> github.com/ollama/ollama/server.(*Server).PullHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/generate --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/chat --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/embed --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/embeddings --> github.com/ollama/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/create --> github.com/ollama/ollama/server.(*Server).CreateHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/push --> github.com/ollama/ollama/server.(*Server).PushHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/copy --> github.com/ollama/ollama/server.(*Server).CopyHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] DELETE /api/delete --> github.com/ollama/ollama/server.(*Server).DeleteHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/show --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/ps --> github.com/ollama/ollama/server.(*Server).PsHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/chat/completions --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/completions --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/embeddings --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /v1/models --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /v1/models/:model --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=routes.go:1238 msg="Listening on 127.0.0.1:11434 (version 0.5.7)"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=routes.go:1267 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11_avx cuda_v12_avx rocm_avx]"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
Jun 01 13:52:32 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:32.053+08:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3090" total="23.7 GiB" available="22.2 GiB"
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Stopping Ollama Service...
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: ollama.service: Succeeded.
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Stopped Ollama Service.
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Started Ollama Service.
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.261+08:00 level=INFO source=routes.go:1234 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.263+08:00 level=INFO source=images.go:479 msg="total blobs: 6" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=images.go:486 msg="total unused blobs removed: 0" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=routes.go:1287 msg="Listening on 127.0.0.1:11434 (version 0.9.0)" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.368+08:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3090" total="23.7 GiB" available="22.2 
GiB" Jun 01 13:53:24 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:53:24 | 200 | 53.146µs | 127.0.0.1 | GET "/api/version" Jun 01 13:53:54 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:53:54 | 200 | 18.698µs | 127.0.0.1 | HEAD "/" Jun 01 13:54:11 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:11 | 200 | 56.416µs | 127.0.0.1 | POST "/api/blobs/sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970" Jun 01 13:54:11 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:11 | 200 | 12.96016ms | 127.0.0.1 | POST "/api/create" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 200 | 21.979µs | 127.0.0.1 | HEAD "/" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 200 | 17.786978ms | 127.0.0.1 | POST "/api/show" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.806+08:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 gpu=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 parallel=2 available=23845470208 required="9.7 GiB" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.852+08:00 level=INFO source=server.go:135 msg="system memory" total="62.7 GiB" free="39.6 GiB" free_swap="0 B" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.852+08:00 level=INFO source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[22.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="9.7 GiB" memory.required.partial="9.7 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[9.7 GiB]" memory.weights.total="7.4 GiB" memory.weights.repeating="6.9 GiB" memory.weights.nonrepeating="532.3 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB" Jun 01 13:54:16 i2r-spd-0030576 
ollama[318199]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 (version GGUF V3 (latest)) Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 0: general.architecture str = llama Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 1: general.name str = model Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 2: llama.block_count u32 = 32 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 3: llama.context_length u32 = 131072 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 10: general.file_type u32 = 7 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: 
llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ... Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128009 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128004 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 21: general.quantization_version u32 = 2 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type f32: 65 tensors Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type q8_0: 226 tensors Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file format = GGUF V3 (latest) Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file type = Q8_0 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file size = 7.95 GiB (8.50 BPW) Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_load_from_file_impl: failed to load model Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.902+08:00 level=INFO source=sched.go:455 msg="NewLlamaServer failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 error="unable to load model: 
/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 500 | 178.506735ms | 127.0.0.1 | POST "/api/generate" Jun 01 13:54:46 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:46 | 200 | 22.997µs | 127.0.0.1 | GET "/api/version" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:35 | 200 | 16.303µs | 127.0.0.1 | HEAD "/" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/baseline_:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6a7c51e44480776aaf091a4f31ab4cd153364eb0d041b5a2d32f1845a8feebf2: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-coder-v2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34488e453cfe3232810bac05c55d94a471228086fcac9e6b00ef3a671e21fa66: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3:70b-instruct error="open /usr/share/ollama/.ollama/models/blobs/sha256-ea8e06d28e479230d9ea75e58a9c6fddad874fdb103a242988bf6bda3a49a085: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/plswork:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-bb11382a7789de02dc236a571472c09994b22906770b7cfbb2dfe639f76659f1: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest 
filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-e934a1db7bb7d7202828298224a280df0c09985d1d35147c301574a20dfe5129: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/smollm-135m-instruct-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6592d34cb09d8fc73587e33c8a10b24009042ef781a61ad77dbcba80d74dae4f: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/testing:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-b94880755682395e160906f09f8aeee5dce2d8515d2e346dfaa55166c10d6f87: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-r1:1.5b error="open /usr/share/ollama/.ollama/models/blobs/sha256-a85fe2a2e58e2426116d3686dfdc1a6ea58640c1e684069976aa730be6c1fa01: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.1:8b error="open /usr/share/ollama/.ollama/models/blobs/sha256-455f34728c9b5dd3376378bfb809ee166c145b0b4c1f1a6feca069055066ef9a: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/phi:2.7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-4ce4b16d33a334b872b8cc4f9d6929905d0bfa19bdc90c5cbed95700d22f747f: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: 
time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth_valid:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-94e9c8af1fe1dc86a7c901a3fb868164bf35f42ab34a2395c3b630ff8b44f21a: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-coder:1.3b error="open /usr/share/ollama/.ollama/models/blobs/sha256-d55c9eb1669a22f75956872166c676634c77cd8dfb94900640bd09a474dfcd0c: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_bert:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-14977e9a990f2f0b51e30486892df6230413f8027ff76298f4249c2fb97219d8: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_jaccard:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-3d0f8987b16341b857b7a13cde92fc230c4d96a93f7f06a7b929922a95552fd9: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_sft_2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-235236765ff77ca7bf2de13d4dd6f6cd9046e632c8b6f8936f6899bdc5542d92: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llamagrpo_vuln-q4_k_m:latest error="open 
/usr/share/ollama/.ollama/models/blobs/sha256-49af9141aa1a0e9e89c252ac918273b9960256468087291b8d2c23436a0469d0: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/gemma:2b error="open /usr/share/ollama/.ollama/models/blobs/sha256-887433b89a901c156f7e6944442f3c9e57f3c55d6ed52042cbb7303aea994290: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mymodel:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-b94880755682395e160906f09f8aeee5dce2d8515d2e346dfaa55166c10d6f87: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mytest3:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-800fec5151b78ad63c6b5dbb73d514c00d5b370c9292e7bb6f824fd707595ba0: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/rlhf:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-43ae35401b26683bfc3a25735ec29e38e32d9d2cda2228dd73090e3dd90563ac: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/rlhf_model:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-eb2c36bd4e64f86cde431fd548624019d9eff04661c643cb9eb035f886303a40: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" 
name=registry.ollama.ai/library/sft_valid_llama_unsloth_valid_full:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-c306eacce700a3e6dfa98d0c7a197d4eef8363985914191aee565f0717ef9dba: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=hf.co/alperenyildiz/Llama-3.2-1B-Instruct_q8_0_GRPO:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-59413b696cad54fe177854e01e5c2d519a2d52dc24501a063c790419da6bf3d0: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/codellama:7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-316526ac7323d6f42305c5bbf1939e1197487c1e6ea1f01292ceb5e3040b707a: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/hftest:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-1dc2a8b64db29f052b583cfd197c6c3e178f090306d4a6934b8b62e1dd4e94a9: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/qwen:1.8b error="open /usr/share/ollama/.ollama/models/blobs/sha256-9ece4a97bfb61bdb539531db5584fa119ad55684281d8a2d864339ae3fdd6c15: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-8100e67f2b81eac2c8c5beff8524a0797e07f90b90ce99c81b04642f2f7f64e3: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: 
time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_testing:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6bbd3b8c4d39e9dd39400627de896f7d332c4ad70c53a7ccb0143df14cb5eb3b: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34a733ed52f1629251c5271080fefc67c21308740366ff1c3563634d185be301: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.1:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-1a4c3c319823fdabddb22479d0b10820a7a39fe49e45c40bae28fbe83926dc14: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/ollama_finetuned_grpo:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-0ce3c9184f3cc1c097a3758ef826f707b4f0c6f9848e5ae61ee09eecae1303e4: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/test:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-aefc9828fc72f25eee51b335dab430ab077f9fd0cd10aa80beb041f06e621d04: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_sft:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-79939762984cfb3c300a6bfdcccfa082f5dcff08ca0b0a57d400b0dc5c8a18b0: no 
such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_single:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-f7a62ea19aca03083e0efa70074db40736ef4a2d248329813548193b8b10a057: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.2:1b error="open /usr/share/ollama/.ollama/models/blobs/sha256-4f659a1e86d7f5a33c389f7991e7224b7ee6ad0358b53437d54c02d2e1b1118d: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mytest:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-3c866098594e4ec49cb7304dbd5d67b3b71b8c2d0225589d3d6e3d908caa52e8: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34bb5ab01051a11372a91f95f3fbbc51173eed8e7f13ec395b9ae9b8bd0e242b: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mistral:7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-42347cd80dc868877d2807869c0e9c90034392b2f1f001cae1563488021e2e19: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.098+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_test:latest error="open 
/usr/share/ollama/.ollama/models/blobs/sha256-77a2ccc8a0bedde7e1ce052ae894bc75c054e5b8594b514bcbe1e7ea6b85a1c7: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.098+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/smolgrpo_vuln-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-98e18d220d09de6cc3277405c524d54670b48f0d6f0c25cda62d99b185d9b327: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:35 | 200 | 2.020727ms | 127.0.0.1 | GET "/api/tags" Jun 01 13:58:50 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:50 | 200 | 16.377µs | 127.0.0.1 | HEAD "/" Jun 01 13:59:28 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:28 | 201 | 22.096522855s | 127.0.0.1 | POST "/api/blobs/sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970" Jun 01 13:59:28 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:28 | 200 | 162.958513ms | 127.0.0.1 | POST "/api/create" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 200 | 42.797µs | 127.0.0.1 | HEAD "/" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 200 | 26.341074ms | 127.0.0.1 | POST "/api/show" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.653+08:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 gpu=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 parallel=2 available=23828299776 required="9.7 GiB" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.701+08:00 level=INFO source=server.go:135 msg="system memory" total="62.7 GiB" free="39.7 GiB" free_swap="0 B" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.701+08:00 level=INFO 
source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[22.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="9.7 GiB" memory.required.partial="9.7 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[9.7 GiB]" memory.weights.total="7.4 GiB" memory.weights.repeating="6.9 GiB" memory.weights.nonrepeating="532.3 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 (version GGUF V3 (latest))
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 0: general.architecture str = llama
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 1: general.name str = model
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 2: llama.block_count u32 = 32
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 3: llama.context_length u32 = 131072
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 10: general.file_type u32 = 7
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128009
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128004
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 21: general.quantization_version u32 = 2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type f32: 65 tensors
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type q8_0: 226 tensors
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file format = GGUF V3 (latest)
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file type = Q8_0
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file size = 7.95 GiB (8.50 BPW)
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_load_from_file_impl: failed to load model
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.728+08:00 level=INFO source=sched.go:455 msg="NewLlamaServer failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 error="unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 500 | 164.725683ms | 127.0.0.1 | POST "/api/generate"
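
For anyone comparing their own logs: the metadata dump in this log lists 22 keys, with `tokenizer.ggml.model` set to `gpt2` (a BPE vocabulary) but no `tokenizer.ggml.merges` entry, which matches the "cannot find tokenizer merges" error. Below is a rough sketch, not anything from Ollama or llama.cpp, that scans a pasted log for the `- kv N:` entries and reports which tokenizer keys are absent. The function names and the list of keys to check are my own assumptions for illustration.

```python
import re

# Matches entries like "- kv 13: tokenizer.ggml.model str = gpt2"
# as printed by llama_model_loader in the log format shown in this thread.
KV_PATTERN = re.compile(r"- kv\s+(\d+):\s+(\S+)\s+(\S+)\s+=")


def metadata_keys(log_text: str) -> list[str]:
    """Return the metadata key names found in a llama_model_loader dump."""
    return [match.group(2) for match in KV_PATTERN.finditer(log_text)]


def missing_bpe_keys(log_text: str) -> list[str]:
    """List tokenizer keys that are expected here but absent from the dump.

    Assumption: only the token list and the merges list are checked,
    because the merges list is what the error message complains about.
    """
    present = set(metadata_keys(log_text))
    expected = ["tokenizer.ggml.tokens", "tokenizer.ggml.merges"]
    return [key for key in expected if key not in present]


if __name__ == "__main__":
    # Two entries copied from the log, joined on one line as in the paste.
    sample = (
        "llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2 "
        'llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", ...'
    )
    print(missing_bpe_keys(sample))  # ['tokenizer.ggml.merges']
```

Because the pattern does not depend on line breaks, it should work on both a one-entry-per-line log and a paste that has been collapsed onto a single line.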
Reference: github-starred/ollama#69252