[GH-ISSUE #10930] Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file llama_load_model_from_file: failed to load model #69252


GiteaMirror · 2026-05-04T17:35:42-05:00

Originally created by @alperen21 on GitHub (May 31, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10930

What is the issue?

I fine-tuned the Llama 3.2 3B model with Unsloth and exported it to GGUF, but Ollama fails to load the resulting model.

This is the error:

`Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file

llama_load_model_from_file: failed to load model`
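The error says the GGUF file has no tokenizer merges, and the converter log in this report also warns "requested but no merges found". One way to narrow this down is to check whether the merged model directory still contains BPE merges before it is converted. The sketch below is only a diagnostic idea, not part of Unsloth or Ollama: it assumes the directory holds a Hugging Face `tokenizer.json` with the merges stored under `model.merges`, and the function name `count_bpe_merges` and the default directory `model` are made up for illustration.

```python
import json
import sys
from pathlib import Path


def count_bpe_merges(model_dir: str) -> int:
    """Return how many BPE merge rules <model_dir>/tokenizer.json contains.

    Returns 0 when the file is missing or has no "merges" list.
    """
    tokenizer_path = Path(model_dir) / "tokenizer.json"
    if not tokenizer_path.is_file():
        return 0
    data = json.loads(tokenizer_path.read_text(encoding="utf-8"))
    # Assumption: Hugging Face fast-tokenizer layout, merges under "model".
    model_section = data.get("model") or {}
    merges = model_section.get("merges") or []
    return len(merges)


if __name__ == "__main__":
    # "model" is an assumed default; pass the real directory as an argument.
    directory = sys.argv[1] if len(sys.argv) > 1 else "model"
    found = count_bpe_merges(directory)
    if found == 0:
        print(f"No BPE merges found in {directory}/tokenizer.json")
    else:
        print(f"Found {found} BPE merges in {directory}/tokenizer.json")
```

If this reports zero merges for the saved model directory, the problem is probably in the saved tokenizer files rather than in Ollama itself.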

This is the Modelfile:
`FROM ./unsloth.Q8_0.gguf

TEMPLATE """Below are some instructions that describe some tasks. Write responses that appropriately complete each request.{{ if .Prompt }}

Instruction:

Response:

{{ .Response }}<|end_of_text|>"""`
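To make the prompt layout of this template easier to compare against the training data, here is a rough Python approximation of the text it describes. It is a sketch under assumptions, not how Ollama evaluates templates: `render_prompt` is a hypothetical helper, and it assumes the instruction text is placed under the `Instruction:` label and that the `{{ if .Prompt }}` condition covers only that part.

```python
# Hypothetical helper that approximates the text layout of the TEMPLATE.
PREAMBLE = (
    "Below are some instructions that describe some tasks. "
    "Write responses that appropriately complete each request."
)


def render_prompt(prompt: str, response: str = "") -> str:
    """Build the text for one instruction/response exchange."""
    text = PREAMBLE
    if prompt:  # mirrors {{ if .Prompt }}
        # Assumption: the instruction goes under the "Instruction:" label.
        text += f"\n\nInstruction:\n\n{prompt}"
    # The response is followed by the end-of-text marker, as in the template.
    text += f"\n\nResponse:\n\n{response}<|end_of_text|>"
    return text


print(
    render_prompt(
        "Find the greatest common divisor (GCD) of 48 and 18.",
        "The GCD of 48 and 18 is 6.",
    )
)
```

The printed text has the same shape as the sample generations in the training output, which makes it easier to see whether the template and the fine-tuning format agree.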

Here is the output of the training script:
`Requirement already satisfied: unsloth in ./.venv/lib/python3.11/site-packages (2025.5.10)
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))== Unsloth 2025.5.10: Fast Llama patching. Transformers: 4.52.4.
\ /| NVIDIA A100 80GB PCIe. Num GPUs = 1. Max memory: 79.254 GB. Platform: Linux.
O^O/ _/ \ Torch: 2.7.0+cu126. CUDA: 8.0. CUDA Toolkit: 12.6. Triton: 3.3.0
\ / Bfloat16 = TRUE. FA [Xformers = 0.0.30. FA2 = False]
"-____-" Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!

_|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
_|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
_|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
_|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
_|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

Add token as git credential? (Y/n) Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in your terminal in case you want to set the 'store' credential helper as default.

git config --global credential.helper store

Read https://git-scm.com/book/en/v2/Git-Tools-Credential-Storage for more details.
['instruction', 'input', 'output', 'text']
['instruction', 'input', 'output', 'text']
GPU = NVIDIA A100 80GB PCIe. Max memory = 79.254 GB.
7.625 GB of memory reserved.
Unsloth: Will smartly offload gradients to save VRAM!
{'loss': 2.0966, 'grad_norm': 0.4324737787246704, 'learning_rate': 0.0, 'epoch': 1.0}
{'train_runtime': 2.6259, 'train_samples_per_second': 1.142, 'train_steps_per_second': 0.381, 'train_loss': 2.0966415405273438, 'epoch': 1.0}
2.6259 seconds used for training.
0.04 minutes used for training.
Peak reserved memory = 7.625 GB.
Peak reserved memory for training = 0.0 GB.
Peak reserved memory % of max memory = 9.621 %.
Peak reserved memory for training % of max memory = 0.0 %.
The next numbers in the Fibonacci sequence are 13, 21, 34, 55, 89, 144.

Instruction:

Find the greatest common divisor (GCD) of 48 and 18.

Response:

The GCD of 48 and 18 is 6.

Instruction:

Solve the equation 2x + 5 = 11.

Response:

To solve for x, we need to isolate x on one side of the equation. Subtract 5 from both sides to get 2x = 6, then divide both sides by 2 to get x = 3.

The Eiffel Tower is the tallest tower in France.<|eot_id|>
The special thing about this sequence is that it is a Fibonacci sequence. Each number after the first two is the sum of the two preceding ones. For example, 5 is the sum of 2 and 3, 8 is the sum of 5 and 3, and so on. This sequence is named after the Italian mathematician Leonardo Fibonacci, who introduced it in the 13th century as a solution to a problem involving the growth of a population of rabbits. The sequence has numerous applications in mathematics, computer science, and other fields, including number theory, algebra, and geometry. It is also used in various areas of
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 200.98 out of 251.51 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...
Unsloth: Saving tokenizer... Done.
Done.
==((====))== Unsloth: Conversion from QLoRA to GGUF information
\ /| [0] Installing llama.cpp might take 3 minutes.
O^O/ _/ \ [1] Converting HF to GGUF 16bits might take 3 minutes.
\ / [2] Converting GGUF 16bits to ['q8_0'] might take 10 minutes each.
"-____-" In total, you will have to wait at least 16 minutes.

Unsloth: Installing llama.cpp. This might take 3 minutes...
Unsloth: [1] Converting model at model into q8_0 GGUF format.
The output location will be /home/alperen/grpo/model/unsloth.Q8_0.gguf
This might take 3 minutes...
INFO:hf-to-gguf:Loading model: model
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
length = 131072
length = 4096
length = 14336
= 32
head count = 8
= 500000.0
epsilon = 1e-05
= 7
requested but no merges found, output may be non-functional.
token type bos to 128000
token type eos to 128009
token type pad to 128004
to True
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00004.safetensors'
INFO:hf-to-gguf:token_embd.weight, torch.bfloat16 --> Q8_0, shape = {4096, 128256}
INFO:hf-to-gguf:blk.0.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.0.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.0.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.0.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.0.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.0.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.0.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.0.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.1.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.1.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.1.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.1.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.1.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.1.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.1.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.1.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.1.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.2.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.2.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.2.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.2.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.2.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.2.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.2.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.2.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.2.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.3.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.3.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.3.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.3.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.3.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.3.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.3.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.3.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.3.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.4.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.4.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.4.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.4.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.4.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.4.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.4.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.4.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.4.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.5.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.5.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.5.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.5.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.5.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.5.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.5.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.5.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.5.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.6.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.6.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.6.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.6.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.6.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.6.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.6.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.6.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.6.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.7.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.7.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.7.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.7.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.7.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.7.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.7.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.7.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.7.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.8.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.8.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.8.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.8.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.8.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.8.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.8.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.8.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.8.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00002-of-00004.safetensors'
INFO:hf-to-gguf:blk.10.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.10.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.10.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.10.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.10.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.10.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.10.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.10.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.10.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.11.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.11.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.11.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.11.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.11.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.11.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.11.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.12.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.12.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.12.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.12.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.12.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.12.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.12.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.12.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.12.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.13.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.13.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.13.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.13.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.13.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.13.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.13.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.13.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.13.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.14.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.14.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.14.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.14.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.14.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.14.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.14.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.14.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.14.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.15.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.15.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.15.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.15.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.15.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.15.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.15.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.15.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.15.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.16.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.16.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.16.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.16.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.16.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.16.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.16.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.16.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.16.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.17.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.17.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.17.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.17.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.17.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.17.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.17.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.17.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.17.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.18.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.18.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.18.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.18.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.18.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.18.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.18.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.18.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.18.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.19.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.19.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.19.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.19.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.19.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.19.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.19.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.19.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.19.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.20.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.20.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.20.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.20.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.20.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.9.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.9.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.9.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.9.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.9.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.9.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.9.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00003-of-00004.safetensors'
INFO:hf-to-gguf:blk.20.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.20.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.20.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.20.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.21.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.21.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.21.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.21.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.21.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.21.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.22.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.22.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.22.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.22.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.22.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.22.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.22.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.23.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.23.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.23.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.23.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.23.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.23.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.23.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.23.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.23.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.24.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.24.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.24.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.24.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.24.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.24.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.24.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.24.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.24.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.25.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.25.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.25.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.25.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.25.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.25.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.25.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.25.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.25.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.26.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.26.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.26.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.26.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.26.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.26.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.26.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.26.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.26.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.27.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.27.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.27.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.27.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.27.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.27.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.27.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.27.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.27.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.28.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.28.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.28.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.28.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.28.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.28.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.28.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.28.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.28.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.29.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.29.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.29.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.29.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.29.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.29.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.29.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.29.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.29.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.30.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.30.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.30.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.30.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.30.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.30.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.30.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.30.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.30.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.31.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.31.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336}
INFO:hf-to-gguf:blk.31.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:blk.31.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.31.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096}
INFO:hf-to-gguf:blk.31.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00004-of-00004.safetensors'
INFO:hf-to-gguf:output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 128256}
INFO:hf-to-gguf:blk.31.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.31.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096}
INFO:hf-to-gguf:blk.31.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:output_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:/home/alperen/grpo/model/unsloth.Q8_0.gguf: n_tensors = 291, total_size = 8.5G

Writing: 0%| | 0.00/8.53G [00:00<?, ?byte/s]
Writing: 7%|▋ | 558M/8.53G [00:05<01:19, 99.8Mbyte/s]
Writing: 7%|▋ | 621M/8.53G [00:06<01:19, 99.9Mbyte/s]
Writing: 8%|▊ | 683M/8.53G [00:06<01:17, 101Mbyte/s]
Writing: 9%|▊ | 745M/8.53G [00:07<01:16, 102Mbyte/s]
Writing: 9%|▉ | 768M/8.53G [00:07<01:15, 102Mbyte/s]
Writing: 9%|▉ | 785M/8.53G [00:07<01:15, 103Mbyte/s]
Writing: 10%|▉ | 852M/8.53G [00:08<01:15, 102Mbyte/s]
Writing: 11%|█ | 915M/8.53G [00:09<01:13, 104Mbyte/s]
Writing: 11%|█▏ | 977M/8.53G [00:09<01:12, 104Mbyte/s]
Writing: 12%|█▏ | 999M/8.53G [00:09<01:11, 105Mbyte/s]
Writing: 12%|█▏ | 1.02G/8.53G [00:09<01:11, 105Mbyte/s]
Writing: 13%|█▎ | 1.08G/8.53G [00:10<01:11, 104Mbyte/s]
Writing: 13%|█▎ | 1.15G/8.53G [00:11<01:10, 105Mbyte/s]
Writing: 14%|█▍ | 1.21G/8.53G [00:11<01:09, 105Mbyte/s]
Writing: 14%|█▍ | 1.23G/8.53G [00:12<01:09, 106Mbyte/s]
Writing: 15%|█▍ | 1.25G/8.53G [00:12<01:08, 106Mbyte/s]
Writing: 15%|█▌ | 1.32G/8.53G [00:12<01:09, 104Mbyte/s]
Writing: 16%|█▌ | 1.38G/8.53G [00:13<01:08, 105Mbyte/s]
Writing: 17%|█▋ | 1.44G/8.53G [00:14<01:07, 105Mbyte/s]
Writing: 17%|█▋ | 1.46G/8.53G [00:14<01:07, 105Mbyte/s]
Writing: 17%|█▋ | 1.48G/8.53G [00:14<01:07, 105Mbyte/s]
Writing: 18%|█▊ | 1.55G/8.53G [00:15<01:07, 103Mbyte/s]
Writing: 19%|█▉ | 1.61G/8.53G [00:15<01:06, 104Mbyte/s]
Writing: 20%|█▉ | 1.67G/8.53G [00:16<01:05, 105Mbyte/s]
Writing: 20%|█▉ | 1.69G/8.53G [00:16<01:04, 105Mbyte/s]
Writing: 20%|██ | 1.71G/8.53G [00:16<01:04, 105Mbyte/s]
Writing: 21%|██ | 1.78G/8.53G [00:17<01:05, 104Mbyte/s]
Writing: 22%|██▏ | 1.84G/8.53G [00:17<01:03, 105Mbyte/s]
Writing: 22%|██▏ | 1.90G/8.53G [00:18<01:03, 105Mbyte/s]
Writing: 23%|██▎ | 1.93G/8.53G [00:18<01:02, 106Mbyte/s]
Writing: 23%|██▎ | 1.94G/8.53G [00:18<01:02, 106Mbyte/s]
Writing: 24%|██▎ | 2.01G/8.53G [00:19<01:03, 103Mbyte/s]
Writing: 24%|██▍ | 2.07G/8.53G [00:20<01:01, 105Mbyte/s]
Writing: 25%|██▌ | 2.14G/8.53G [00:20<01:00, 105Mbyte/s]
Writing: 25%|██▌ | 2.16G/8.53G [00:20<01:00, 106Mbyte/s]
Writing: 26%|██▌ | 2.18G/8.53G [00:21<01:00, 106Mbyte/s]
Writing: 26%|██▋ | 2.24G/8.53G [00:21<01:00, 104Mbyte/s]
Writing: 27%|██▋ | 2.31G/8.53G [00:22<00:59, 105Mbyte/s]
Writing: 28%|██▊ | 2.37G/8.53G [00:22<00:58, 105Mbyte/s]
Writing: 28%|██▊ | 2.39G/8.53G [00:23<00:58, 106Mbyte/s]
Writing: 28%|██▊ | 2.41G/8.53G [00:23<00:57, 106Mbyte/s]
Writing: 29%|██▉ | 2.47G/8.53G [00:23<00:58, 104Mbyte/s]
Writing: 30%|██▉ | 2.54G/8.53G [00:24<00:57, 104Mbyte/s]
Writing: 30%|███ | 2.60G/8.53G [00:25<00:56, 104Mbyte/s]
Writing: 31%|███ | 2.62G/8.53G [00:25<00:56, 105Mbyte/s]
Writing: 31%|███ | 2.64G/8.53G [00:25<00:56, 105Mbyte/s]
Writing: 32%|███▏ | 2.71G/8.53G [00:26<00:59, 98.0Mbyte/s]
Writing: 32%|███▏ | 2.77G/8.53G [00:26<00:57, 101Mbyte/s]
Writing: 33%|███▎ | 2.83G/8.53G [00:27<00:55, 103Mbyte/s]
Writing: 33%|███▎ | 2.85G/8.53G [00:27<00:54, 104Mbyte/s]
Writing: 34%|███▎ | 2.87G/8.53G [00:27<00:54, 104Mbyte/s]
Writing: 34%|███▍ | 2.94G/8.53G [00:28<00:54, 103Mbyte/s]
Writing: 35%|███▌ | 3.00G/8.53G [00:28<00:53, 104Mbyte/s]
Writing: 36%|███▌ | 3.06G/8.53G [00:29<00:52, 105Mbyte/s]
Writing: 36%|███▌ | 3.09G/8.53G [00:29<00:51, 105Mbyte/s]
Writing: 36%|███▋ | 3.10G/8.53G [00:29<00:51, 105Mbyte/s]
Writing: 37%|███▋ | 3.17G/8.53G [00:30<00:51, 104Mbyte/s]
Writing: 38%|███▊ | 3.23G/8.53G [00:31<00:50, 105Mbyte/s]
Writing: 39%|███▊ | 3.29G/8.53G [00:31<00:49, 105Mbyte/s]
Writing: 39%|███▉ | 3.32G/8.53G [00:31<00:49, 106Mbyte/s]
Writing: 39%|███▉ | 3.33G/8.53G [00:32<00:49, 106Mbyte/s]
Writing: 40%|███▉ | 3.40G/8.53G [00:32<00:49, 104Mbyte/s]
Writing: 41%|████ | 3.46G/8.53G [00:33<00:48, 104Mbyte/s]
Writing: 41%|████▏ | 3.53G/8.53G [00:34<00:47, 104Mbyte/s]
Writing: 42%|████▏ | 3.55G/8.53G [00:34<00:47, 105Mbyte/s]
Writing: 42%|████▏ | 3.57G/8.53G [00:34<00:47, 104Mbyte/s]
Writing: 43%|████▎ | 3.63G/8.53G [00:35<00:47, 102Mbyte/s]
Writing: 43%|████▎ | 3.70G/8.53G [00:35<00:46, 104Mbyte/s]
Writing: 44%|████▍ | 3.76G/8.53G [00:36<00:45, 104Mbyte/s]
Writing: 44%|████▍ | 3.78G/8.53G [00:36<00:45, 105Mbyte/s]
Writing: 45%|████▍ | 3.80G/8.53G [00:36<00:45, 105Mbyte/s]
Writing: 45%|████▌ | 3.87G/8.53G [00:37<00:45, 103Mbyte/s]
Writing: 46%|████▌ | 3.93G/8.53G [00:37<00:44, 104Mbyte/s]
Writing: 47%|████▋ | 3.99G/8.53G [00:38<00:43, 105Mbyte/s]
Writing: 47%|████▋ | 4.01G/8.53G [00:38<00:42, 105Mbyte/s]
Writing: 47%|████▋ | 4.03G/8.53G [00:38<00:42, 105Mbyte/s]
Writing: 48%|████▊ | 4.10G/8.53G [00:39<00:42, 103Mbyte/s]
Writing: 49%|████▊ | 4.16G/8.53G [00:40<00:41, 104Mbyte/s]
Writing: 49%|████▉ | 4.22G/8.53G [00:40<00:41, 105Mbyte/s]
Writing: 50%|████▉ | 4.24G/8.53G [00:40<00:40, 105Mbyte/s]
Writing: 50%|████▉ | 4.26G/8.53G [00:41<00:40, 105Mbyte/s]
Writing: 51%|█████ | 4.33G/8.53G [00:41<00:40, 104Mbyte/s]
Writing: 51%|█████▏ | 4.39G/8.53G [00:42<00:39, 104Mbyte/s]
Writing: 52%|█████▏ | 4.45G/8.53G [00:42<00:38, 105Mbyte/s]
Writing: 52%|█████▏ | 4.48G/8.53G [00:43<00:38, 106Mbyte/s]
Writing: 53%|█████▎ | 4.49G/8.53G [00:43<00:38, 105Mbyte/s]
Writing: 53%|█████▎ | 4.56G/8.53G [00:43<00:38, 104Mbyte/s]
Writing: 54%|█████▍ | 4.62G/8.53G [00:44<00:37, 105Mbyte/s]
Writing: 55%|█████▍ | 4.69G/8.53G [00:45<00:36, 104Mbyte/s]
Writing: 55%|█████▌ | 4.71G/8.53G [00:45<00:36, 105Mbyte/s]
Writing: 55%|█████▌ | 4.73G/8.53G [00:45<00:36, 105Mbyte/s]
Writing: 56%|█████▌ | 4.79G/8.53G [00:46<00:36, 104Mbyte/s]
Writing: 57%|█████▋ | 4.85G/8.53G [00:46<00:35, 104Mbyte/s]
Writing: 58%|█████▊ | 4.92G/8.53G [00:47<00:34, 105Mbyte/s]
Writing: 58%|█████▊ | 4.94G/8.53G [00:47<00:34, 105Mbyte/s]
Writing: 58%|█████▊ | 4.96G/8.53G [00:47<00:33, 105Mbyte/s]
Writing: 59%|█████▉ | 5.02G/8.53G [00:48<00:33, 106Mbyte/s]
Writing: 59%|█████▉ | 5.05G/8.53G [00:48<00:32, 106Mbyte/s]
Writing: 59%|█████▉ | 5.06G/8.53G [00:48<00:32, 106Mbyte/s]
Writing: 60%|██████ | 5.13G/8.53G [00:49<00:32, 104Mbyte/s]
Writing: 61%|██████ | 5.19G/8.53G [00:49<00:31, 105Mbyte/s]
Writing: 62%|██████▏ | 5.26G/8.53G [00:50<00:31, 105Mbyte/s]
Writing: 62%|██████▏ | 5.28G/8.53G [00:50<00:30, 106Mbyte/s]
Writing: 62%|██████▏ | 5.30G/8.53G [00:50<00:30, 106Mbyte/s]
Writing: 63%|██████▎ | 5.36G/8.53G [00:51<00:32, 98.7Mbyte/s]
Writing: 64%|██████▎ | 5.43G/8.53G [00:52<00:30, 102Mbyte/s]
Writing: 64%|██████▍ | 5.49G/8.53G [00:52<00:30, 101Mbyte/s]
Writing: 65%|██████▌ | 5.55G/8.53G [00:53<00:28, 103Mbyte/s]
Writing: 66%|██████▌ | 5.61G/8.53G [00:54<00:28, 104Mbyte/s]
Writing: 66%|██████▌ | 5.63G/8.53G [00:54<00:27, 105Mbyte/s]
Writing: 66%|██████▌ | 5.65G/8.53G [00:54<00:27, 105Mbyte/s]
Writing: 67%|██████▋ | 5.72G/8.53G [00:55<00:27, 104Mbyte/s]
Writing: 68%|██████▊ | 5.78G/8.53G [00:55<00:26, 105Mbyte/s]
Writing: 68%|██████▊ | 5.84G/8.53G [00:56<00:25, 105Mbyte/s]
Writing: 69%|██████▉ | 5.87G/8.53G [00:56<00:25, 106Mbyte/s]
Writing: 69%|██████▉ | 5.88G/8.53G [00:56<00:25, 106Mbyte/s]
Writing: 70%|██████▉ | 5.95G/8.53G [00:57<00:24, 104Mbyte/s]
Writing: 70%|███████ | 6.01G/8.53G [00:57<00:24, 105Mbyte/s]
Writing: 71%|███████ | 6.08G/8.53G [00:58<00:23, 105Mbyte/s]
Writing: 71%|███████▏ | 6.10G/8.53G [00:58<00:22, 106Mbyte/s]
Writing: 72%|███████▏ | 6.12G/8.53G [00:58<00:22, 106Mbyte/s]
Writing: 72%|███████▏ | 6.18G/8.53G [00:59<00:22, 104Mbyte/s]
Writing: 73%|███████▎ | 6.25G/8.53G [01:00<00:21, 105Mbyte/s]
Writing: 74%|███████▍ | 6.31G/8.53G [01:00<00:21, 105Mbyte/s]
Writing: 74%|███████▍ | 6.33G/8.53G [01:00<00:20, 106Mbyte/s]
Writing: 74%|███████▍ | 6.35G/8.53G [01:01<00:20, 106Mbyte/s]
Writing: 75%|███████▌ | 6.41G/8.53G [01:01<00:20, 104Mbyte/s]
Writing: 76%|███████▌ | 6.48G/8.53G [01:02<00:19, 105Mbyte/s]
Writing: 77%|███████▋ | 6.54G/8.53G [01:02<00:19, 105Mbyte/s]
Writing: 77%|███████▋ | 6.56G/8.53G [01:03<00:18, 105Mbyte/s]
Writing: 77%|███████▋ | 6.58G/8.53G [01:03<00:18, 105Mbyte/s]
Writing: 78%|███████▊ | 6.65G/8.53G [01:03<00:18, 104Mbyte/s]
Writing: 79%|███████▊ | 6.71G/8.53G [01:04<00:17, 105Mbyte/s]
Writing: 79%|███████▉ | 6.77G/8.53G [01:05<00:16, 105Mbyte/s]
Writing: 80%|███████▉ | 6.79G/8.53G [01:05<00:16, 106Mbyte/s]
Writing: 80%|███████▉ | 6.81G/8.53G [01:05<00:16, 106Mbyte/s]
Writing: 81%|████████ | 6.88G/8.53G [01:06<00:15, 104Mbyte/s]
Writing: 81%|████████▏ | 6.94G/8.53G [01:06<00:15, 105Mbyte/s]
Writing: 82%|████████▏ | 7.00G/8.53G [01:07<00:14, 106Mbyte/s]
Writing: 82%|████████▏ | 7.03G/8.53G [01:07<00:14, 106Mbyte/s]
Writing: 83%|████████▎ | 7.04G/8.53G [01:07<00:14, 106Mbyte/s]
Writing: 83%|████████▎ | 7.11G/8.53G [01:08<00:13, 104Mbyte/s]
Writing: 84%|████████▍ | 7.17G/8.53G [01:08<00:12, 105Mbyte/s]
Writing: 85%|████████▍ | 7.23G/8.53G [01:09<00:12, 106Mbyte/s]
Writing: 85%|████████▌ | 7.26G/8.53G [01:09<00:12, 106Mbyte/s]
Writing: 85%|████████▌ | 7.27G/8.53G [01:09<00:11, 106Mbyte/s]
Writing: 86%|████████▌ | 7.34G/8.53G [01:10<00:11, 104Mbyte/s]
Writing: 87%|████████▋ | 7.40G/8.53G [01:11<00:10, 105Mbyte/s]
Writing: 88%|████████▊ | 7.47G/8.53G [01:11<00:10, 106Mbyte/s]
Writing: 88%|████████▊ | 7.49G/8.53G [01:11<00:09, 106Mbyte/s]
Writing: 88%|████████▊ | 7.51G/8.53G [01:12<00:09, 106Mbyte/s]
Writing: 89%|████████▉ | 7.57G/8.53G [01:12<00:09, 104Mbyte/s]
Writing: 89%|████████▉ | 7.64G/8.53G [01:13<00:08, 105Mbyte/s]
Writing: 90%|█████████ | 7.70G/8.53G [01:13<00:07, 105Mbyte/s]
Writing: 90%|█████████ | 7.72G/8.53G [01:14<00:07, 105Mbyte/s]
Writing: 91%|█████████ | 7.74G/8.53G [01:14<00:07, 105Mbyte/s]
Writing: 91%|█████████▏| 7.81G/8.53G [01:14<00:06, 105Mbyte/s]
Writing: 92%|█████████▏| 7.87G/8.53G [01:15<00:06, 105Mbyte/s]
Writing: 92%|█████████▏| 7.89G/8.53G [01:15<00:06, 106Mbyte/s]
Writing: 93%|█████████▎| 7.91G/8.53G [01:15<00:05, 106Mbyte/s]
Writing: 99%|█████████▉| 8.47G/8.53G [01:21<00:00, 101Mbyte/s]
Writing: 100%|█████████▉| 8.53G/8.53G [01:22<00:00, 101Mbyte/s]
Writing: 100%|██████████| 8.53G/8.53G [01:22<00:00, 104Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to /home/alperen/grpo/model/unsloth.Q8_0.gguf
Unsloth: Conversion completed! Output location: /home/alperen/grpo/model/unsloth.Q8_0.gguf
Unsloth: Saved Ollama Modelfile to model/Modelfile
wandb:
wandb: 🚀 View run sft_train at: https://wandb.ai/alperenyildiz-nus/R4VD_Training/runs/rxcxxtd3
wandb: Find logs at: wandb/run-20250531_205259-rxcxxtd3/logs
`
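
As a sanity check on the log above, the tensor shapes and the reported `n_tensors = 291` can be turned into an expected file size. This is only a rough sketch: the layout of nine tensors per block plus three non-block tensors (token embedding, output norm, output) is an assumption inferred from the shapes in the log, and the 34-bytes-per-32-weights figure is the Q8_0 block layout (32 int8 values plus one fp16 scale).

```python
# Shapes copied from the conversion log. The tensor layout below is an
# assumption inferred from the log, not something the converter reports.
EMBED, FFN, KV, VOCAB = 4096, 14336, 1024, 128256
N_TENSORS = 291            # "n_tensors = 291" in the log
TENSORS_PER_BLOCK = 9      # assumed: 2 norms, q/k/v/output, ffn gate/up/down
NON_BLOCK_TENSORS = 3      # assumed: token_embd, output_norm, output

n_blocks = (N_TENSORS - NON_BLOCK_TENSORS) // TENSORS_PER_BLOCK

# Weights stored as Q8_0 in each block: q and output (EMBED x EMBED),
# k and v (EMBED x KV), and the three feed-forward matrices (EMBED x FFN).
q8_per_block = 2 * EMBED * EMBED + 2 * EMBED * KV + 3 * EMBED * FFN
q8_weights = n_blocks * q8_per_block + 2 * EMBED * VOCAB

# Norm vectors are kept as F32 (4 bytes per value).
f32_weights = n_blocks * 2 * EMBED + EMBED

# Q8_0 packs 32 int8 weights with one fp16 scale: 34 bytes per 32 weights.
total_bytes = q8_weights // 32 * 34 + f32_weights * 4

print(f"blocks: {n_blocks}")
print(f"estimated tensor data: {total_bytes / 1e9:.2f} GB")
```

Under these assumptions the estimate comes out at 32 blocks and about 8.53 GB, which lines up with the `8.53G` shown in the write progress. That suggests the weights themselves were exported in full, so the problem is more likely in the tokenizer metadata than in the tensor data.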

I am using the original Llama3_(8B)_Ollama.ipynb from Unsloth.
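
Since the error is about missing tokenizer merges, one quick check is whether the saved Hugging Face tokenizer still carries BPE merge rules before conversion. The sketch below uses only the standard library. The path `model/tokenizer.json` is an assumption based on the output directory in the log, and `count_bpe_merges` is a hypothetical helper name, not part of Unsloth or llama.cpp.

```python
import json
from pathlib import Path


def count_bpe_merges(tokenizer_json: Path) -> int:
    """Count the BPE merge rules stored in a Hugging Face tokenizer.json.

    Merges live under model.merges and may be stored either as "a b"
    strings or as two-element lists; both forms count as one rule.
    """
    data = json.loads(tokenizer_json.read_text(encoding="utf-8"))
    merges = data.get("model", {}).get("merges") or []
    return len(merges)


if __name__ == "__main__":
    # Assumed location: the directory the model was saved to before conversion.
    path = Path("model/tokenizer.json")
    if path.exists():
        print(f"{path}: {count_bpe_merges(path)} merges")
    else:
        print(f"{path} not found; adjust the path to the saved model directory")
```

If this reports zero merges, the tokenizer was saved without them. If it reports a large number but the GGUF still fails to load, the converter may not be reading the merges format that this tokenizer version writes.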

Here are the dependencies I am using:

accelerate==1.7.0 aiohappyeyeballs==2.4.4 aiohttp==3.10.11 aiosignal==1.3.1 annotated-types==0.7.0 anyio==4.5.2 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 arrow==1.3.0 asttokens==3.0.0 async-lru==2.0.5 async-timeout==4.0.3 attrs==25.1.0 babel==2.17.0 beautifulsoup4==4.13.4 bitsandbytes==0.45.5 bleach==6.2.0 certifi==2025.4.26 cffi==1.17.1 charset-normalizer==3.4.2 click==8.2.1 comm==0.2.2 cut-cross-entropy==25.1.1 dataclasses-json==0.6.7 datasets==3.6.0 debugpy==1.8.14 decorator==5.2.1 defusedxml==0.7.1 diffusers==0.33.1 dill==0.3.8 distro==1.9.0 docker-pycreds==0.4.0 docstring_parser==0.16 exceptiongroup==1.2.2 executing==2.2.0 faiss-cpu==1.8.0.post1 fastjsonschema==2.21.1 filelock==3.18.0 fqdn==1.5.1 frozenlist==1.5.0 fsspec==2025.3.0 gguf==0.16.3 gitdb==4.0.12 GitPython==3.1.44 greenlet==3.1.1 h11==0.14.0 hf-xet==1.1.2 hf_transfer==0.1.9 httpcore==1.0.7 httpx==0.28.1 huggingface-hub==0.32.3 idna==3.10 importlib_metadata==8.7.0 ipykernel==6.29.5 ipython==9.2.0 ipython_pygments_lexers==1.1.1 ipywidgets==8.1.7 isoduration==20.11.0 jedi==0.19.2 Jinja2==3.1.6 jiter==0.8.2 joblib==1.5.1 json5==0.12.0 jsonpatch==1.33 jsonpointer==3.0.0 jsonschema==4.24.0 jsonschema-specifications==2025.4.1 jupyter==1.1.1 jupyter-console==6.6.3 jupyter-events==0.12.0 jupyter-lsp==2.2.5 jupyter_client==8.6.3 jupyter_core==5.8.1 jupyter_server==2.16.0 jupyter_server_terminals==0.5.3 jupyterlab==4.4.3 jupyterlab_pygments==0.3.0 jupyterlab_server==2.27.3 jupyterlab_widgets==3.0.15 langchain==0.2.17 langchain-community==0.2.19 langchain-core==0.2.43 langchain-ollama==0.1.3 langchain-openai==0.1.25 langchain-text-splitters==0.2.4 langsmith==0.1.147 markdown-it-py==3.0.0 MarkupSafe==3.0.2 marshmallow==3.22.0 matplotlib-inline==0.1.7 mdurl==0.1.2 mistune==3.1.3 mpmath==1.3.0 msgspec==0.19.0 multidict==6.1.0 multiprocess==0.70.16 mypy-extensions==1.0.0 nbclient==0.10.2 nbconvert==7.16.6 nbformat==5.10.4 nest-asyncio==1.6.0 networkx==3.4.2 notebook==7.4.3 notebook_shim==0.2.4 numpy==2.0.2 
nvidia-cublas-cu12==12.6.4.1 nvidia-cuda-cupti-cu12==12.6.80 nvidia-cuda-nvrtc-cu12==12.6.77 nvidia-cuda-runtime-cu12==12.6.77 nvidia-cudnn-cu12==9.5.1.17 nvidia-cufft-cu12==11.3.0.4 nvidia-cufile-cu12==1.11.1.6 nvidia-curand-cu12==10.3.7.77 nvidia-cusolver-cu12==11.7.1.2 nvidia-cusparse-cu12==12.5.4.2 nvidia-cusparselt-cu12==0.6.3 nvidia-nccl-cu12==2.26.2 nvidia-nvjitlink-cu12==12.6.85 nvidia-nvtx-cu12==12.6.77 ollama==0.4.7 openai==1.65.5 orjson==3.10.15 overrides==7.7.0 packaging==25.0 pandas==2.2.3 pandocfilters==1.5.1 parso==0.8.4 peft==0.15.2 pexpect==4.9.0 pillow==11.2.1 platformdirs==4.3.8 prometheus_client==0.22.0 prompt_toolkit==3.0.51 propcache==0.2.0 protobuf==3.20.3 psutil==7.0.0 ptyprocess==0.7.0 pure_eval==0.2.3 pyarrow==20.0.0 pycparser==2.22 pydantic==2.10.6 pydantic_core==2.27.2 Pygments==2.19.1 PyMuPDF==1.24.11 python-dateutil==2.9.0.post0 python-json-logger==3.3.0 pytz==2025.2 PyYAML==6.0.2 pyzmq==26.4.0 referencing==0.36.2 regex==2024.11.6 requests==2.32.3 requests-toolbelt==1.0.0 rfc3339-validator==0.1.4 rfc3986-validator==0.1.1 rich==14.0.0 rpds-py==0.25.1 safetensors==0.5.3 scikit-learn==1.6.1 scipy==1.15.3 Send2Trash==1.8.3 sentence-transformers==4.1.0 sentencepiece==0.2.0 sentry-sdk==2.29.1 setproctitle==1.3.6 shtab==1.7.2 six==1.17.0 smmap==5.0.2 sniffio==1.3.1 soupsieve==2.7 SQLAlchemy==2.0.38 stack-data==0.6.3 sympy==1.14.0 tenacity==8.5.0 terminado==0.18.1 threadpoolctl==3.6.0 tiktoken==0.7.0 tinycss2==1.4.0 tokenizers==0.21.1 torch==2.7.0 torchvision==0.22.0 tornado==6.5.1 tqdm==4.67.1 traitlets==5.14.3 transformers==4.52.4 triton==3.3.0 trl==0.15.2 typeguard==4.4.2 types-python-dateutil==2.9.0.20250516 typing-inspect==0.9.0 typing_extensions==4.13.2 tyro==0.9.21 tzdata==2025.2 unsloth @ git+https://github.com/unslothai/unsloth.git@beef0cbcb6ecf1fa126589bd2877be85a91bfb8f unsloth_zoo==2025.5.11 uri-template==1.3.0 urllib3==2.4.0 wandb==0.19.11 wcwidth==0.2.13 webcolors==24.11.1 webencodings==0.5.1 websocket-client==1.8.0 
widgetsnbextension==4.0.14 xformers==0.0.30 xxhash==3.5.0 yarl==1.15.2 zipp==3.21.0

Relevant log output

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.5.7

{4096, 4096} INFO:hf-to-gguf:blk.22.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.23.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.23.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.23.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.23.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.23.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.23.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.23.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.23.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.23.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.24.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.24.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.24.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.24.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.24.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.24.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.24.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.24.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.24.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.25.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.25.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.25.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.25.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.25.ffn_norm.weight, 
torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.25.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.25.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.25.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.25.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.26.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.26.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.26.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.26.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.26.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.26.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.26.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.26.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.26.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.27.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.27.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.27.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.27.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.27.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.27.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.27.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.27.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.27.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.28.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} 
INFO:hf-to-gguf:blk.28.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.28.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.28.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.28.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.28.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.28.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.28.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.28.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.29.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.29.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.29.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.29.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.29.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.29.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.29.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.29.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.29.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.30.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.30.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.30.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.30.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.30.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.30.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.30.attn_output.weight, torch.bfloat16 
--> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.30.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.30.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.31.ffn_gate.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.31.ffn_up.weight, torch.bfloat16 --> Q8_0, shape = {4096, 14336} INFO:hf-to-gguf:blk.31.attn_k.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:blk.31.attn_output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.31.attn_q.weight, torch.bfloat16 --> Q8_0, shape = {4096, 4096} INFO:hf-to-gguf:blk.31.attn_v.weight, torch.bfloat16 --> Q8_0, shape = {4096, 1024} INFO:hf-to-gguf:gguf: loading model part 'model-00004-of-00004.safetensors' INFO:hf-to-gguf:output.weight, torch.bfloat16 --> Q8_0, shape = {4096, 128256} INFO:hf-to-gguf:blk.31.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:blk.31.ffn_down.weight, torch.bfloat16 --> Q8_0, shape = {14336, 4096} INFO:hf-to-gguf:blk.31.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:hf-to-gguf:output_norm.weight, torch.bfloat16 --> F32, shape = {4096} INFO:gguf.gguf_writer:Writing the following files: INFO:gguf.gguf_writer:/home/alperen/grpo/model/unsloth.Q8_0.gguf: n_tensors = 291, total_size = 8.5G Writing: 0%| | 0.00/8.53G [00:00<?, ?byte/s] Writing: 7%|▋ | 558M/8.53G [00:05<01:19, 99.8Mbyte/s] Writing: 7%|▋ | 621M/8.53G [00:06<01:19, 99.9Mbyte/s] Writing: 8%|▊ | 683M/8.53G [00:06<01:17, 101Mbyte/s] Writing: 9%|▊ | 745M/8.53G [00:07<01:16, 102Mbyte/s] Writing: 9%|▉ | 768M/8.53G [00:07<01:15, 102Mbyte/s] Writing: 9%|▉ | 785M/8.53G [00:07<01:15, 103Mbyte/s] Writing: 10%|▉ | 852M/8.53G [00:08<01:15, 102Mbyte/s] Writing: 11%|█ | 915M/8.53G [00:09<01:13, 104Mbyte/s] Writing: 11%|█▏ | 977M/8.53G [00:09<01:12, 104Mbyte/s] Writing: 12%|█▏ | 999M/8.53G [00:09<01:11, 105Mbyte/s] Writing: 12%|█▏ | 1.02G/8.53G [00:09<01:11, 105Mbyte/s] Writing: 13%|█▎ | 
1.08G/8.53G [00:10<01:11, 104Mbyte/s] Writing: 13%|█▎ | 1.15G/8.53G [00:11<01:10, 105Mbyte/s] Writing: 14%|█▍ | 1.21G/8.53G [00:11<01:09, 105Mbyte/s] Writing: 14%|█▍ | 1.23G/8.53G [00:12<01:09, 106Mbyte/s] Writing: 15%|█▍ | 1.25G/8.53G [00:12<01:08, 106Mbyte/s] Writing: 15%|█▌ | 1.32G/8.53G [00:12<01:09, 104Mbyte/s] Writing: 16%|█▌ | 1.38G/8.53G [00:13<01:08, 105Mbyte/s] Writing: 17%|█▋ | 1.44G/8.53G [00:14<01:07, 105Mbyte/s] Writing: 17%|█▋ | 1.46G/8.53G [00:14<01:07, 105Mbyte/s] Writing: 17%|█▋ | 1.48G/8.53G [00:14<01:07, 105Mbyte/s] Writing: 18%|█▊ | 1.55G/8.53G [00:15<01:07, 103Mbyte/s] Writing: 19%|█▉ | 1.61G/8.53G [00:15<01:06, 104Mbyte/s] Writing: 20%|█▉ | 1.67G/8.53G [00:16<01:05, 105Mbyte/s] Writing: 20%|█▉ | 1.69G/8.53G [00:16<01:04, 105Mbyte/s] Writing: 20%|██ | 1.71G/8.53G [00:16<01:04, 105Mbyte/s] Writing: 21%|██ | 1.78G/8.53G [00:17<01:05, 104Mbyte/s] Writing: 22%|██▏ | 1.84G/8.53G [00:17<01:03, 105Mbyte/s] Writing: 22%|██▏ | 1.90G/8.53G [00:18<01:03, 105Mbyte/s] Writing: 23%|██▎ | 1.93G/8.53G [00:18<01:02, 106Mbyte/s] Writing: 23%|██▎ | 1.94G/8.53G [00:18<01:02, 106Mbyte/s] Writing: 24%|██▎ | 2.01G/8.53G [00:19<01:03, 103Mbyte/s] Writing: 24%|██▍ | 2.07G/8.53G [00:20<01:01, 105Mbyte/s] Writing: 25%|██▌ | 2.14G/8.53G [00:20<01:00, 105Mbyte/s] Writing: 25%|██▌ | 2.16G/8.53G [00:20<01:00, 106Mbyte/s] Writing: 26%|██▌ | 2.18G/8.53G [00:21<01:00, 106Mbyte/s] Writing: 26%|██▋ | 2.24G/8.53G [00:21<01:00, 104Mbyte/s] Writing: 27%|██▋ | 2.31G/8.53G [00:22<00:59, 105Mbyte/s] Writing: 28%|██▊ | 2.37G/8.53G [00:22<00:58, 105Mbyte/s] Writing: 28%|██▊ | 2.39G/8.53G [00:23<00:58, 106Mbyte/s] Writing: 28%|██▊ | 2.41G/8.53G [00:23<00:57, 106Mbyte/s] Writing: 29%|██▉ | 2.47G/8.53G [00:23<00:58, 104Mbyte/s] Writing: 30%|██▉ | 2.54G/8.53G [00:24<00:57, 104Mbyte/s] Writing: 30%|███ | 2.60G/8.53G [00:25<00:56, 104Mbyte/s] Writing: 31%|███ | 2.62G/8.53G [00:25<00:56, 105Mbyte/s] Writing: 31%|███ | 2.64G/8.53G [00:25<00:56, 105Mbyte/s] Writing: 32%|███▏ | 2.71G/8.53G 
[00:26<00:59, 98.0Mbyte/s] Writing: 32%|███▏ | 2.77G/8.53G [00:26<00:57, 101Mbyte/s] Writing: 33%|███▎ | 2.83G/8.53G [00:27<00:55, 103Mbyte/s] Writing: 33%|███▎ | 2.85G/8.53G [00:27<00:54, 104Mbyte/s] Writing: 34%|███▎ | 2.87G/8.53G [00:27<00:54, 104Mbyte/s] Writing: 34%|███▍ | 2.94G/8.53G [00:28<00:54, 103Mbyte/s] Writing: 35%|███▌ | 3.00G/8.53G [00:28<00:53, 104Mbyte/s] Writing: 36%|███▌ | 3.06G/8.53G [00:29<00:52, 105Mbyte/s] Writing: 36%|███▌ | 3.09G/8.53G [00:29<00:51, 105Mbyte/s] Writing: 36%|███▋ | 3.10G/8.53G [00:29<00:51, 105Mbyte/s] Writing: 37%|███▋ | 3.17G/8.53G [00:30<00:51, 104Mbyte/s] Writing: 38%|███▊ | 3.23G/8.53G [00:31<00:50, 105Mbyte/s] Writing: 39%|███▊ | 3.29G/8.53G [00:31<00:49, 105Mbyte/s] Writing: 39%|███▉ | 3.32G/8.53G [00:31<00:49, 106Mbyte/s] Writing: 39%|███▉ | 3.33G/8.53G [00:32<00:49, 106Mbyte/s] Writing: 40%|███▉ | 3.40G/8.53G [00:32<00:49, 104Mbyte/s] Writing: 41%|████ | 3.46G/8.53G [00:33<00:48, 104Mbyte/s] Writing: 41%|████▏ | 3.53G/8.53G [00:34<00:47, 104Mbyte/s] Writing: 42%|████▏ | 3.55G/8.53G [00:34<00:47, 105Mbyte/s] Writing: 42%|████▏ | 3.57G/8.53G [00:34<00:47, 104Mbyte/s] Writing: 43%|████▎ | 3.63G/8.53G [00:35<00:47, 102Mbyte/s] Writing: 43%|████▎ | 3.70G/8.53G [00:35<00:46, 104Mbyte/s] Writing: 44%|████▍ | 3.76G/8.53G [00:36<00:45, 104Mbyte/s] Writing: 44%|████▍ | 3.78G/8.53G [00:36<00:45, 105Mbyte/s] Writing: 45%|████▍ | 3.80G/8.53G [00:36<00:45, 105Mbyte/s] Writing: 45%|████▌ | 3.87G/8.53G [00:37<00:45, 103Mbyte/s] Writing: 46%|████▌ | 3.93G/8.53G [00:37<00:44, 104Mbyte/s] Writing: 47%|████▋ | 3.99G/8.53G [00:38<00:43, 105Mbyte/s] Writing: 47%|████▋ | 4.01G/8.53G [00:38<00:42, 105Mbyte/s] Writing: 47%|████▋ | 4.03G/8.53G [00:38<00:42, 105Mbyte/s] Writing: 48%|████▊ | 4.10G/8.53G [00:39<00:42, 103Mbyte/s] Writing: 49%|████▊ | 4.16G/8.53G [00:40<00:41, 104Mbyte/s] Writing: 49%|████▉ | 4.22G/8.53G [00:40<00:41, 105Mbyte/s] Writing: 50%|████▉ | 4.24G/8.53G [00:40<00:40, 105Mbyte/s] Writing: 50%|████▉ | 4.26G/8.53G 
[00:41<00:40, 105Mbyte/s] Writing: 51%|█████ | 4.33G/8.53G [00:41<00:40, 104Mbyte/s] Writing: 51%|█████▏ | 4.39G/8.53G [00:42<00:39, 104Mbyte/s] Writing: 52%|█████▏ | 4.45G/8.53G [00:42<00:38, 105Mbyte/s] Writing: 52%|█████▏ | 4.48G/8.53G [00:43<00:38, 106Mbyte/s] Writing: 53%|█████▎ | 4.49G/8.53G [00:43<00:38, 105Mbyte/s] Writing: 53%|█████▎ | 4.56G/8.53G [00:43<00:38, 104Mbyte/s] Writing: 54%|█████▍ | 4.62G/8.53G [00:44<00:37, 105Mbyte/s] Writing: 55%|█████▍ | 4.69G/8.53G [00:45<00:36, 104Mbyte/s] Writing: 55%|█████▌ | 4.71G/8.53G [00:45<00:36, 105Mbyte/s] Writing: 55%|█████▌ | 4.73G/8.53G [00:45<00:36, 105Mbyte/s] Writing: 56%|█████▌ | 4.79G/8.53G [00:46<00:36, 104Mbyte/s] Writing: 57%|█████▋ | 4.85G/8.53G [00:46<00:35, 104Mbyte/s] Writing: 58%|█████▊ | 4.92G/8.53G [00:47<00:34, 105Mbyte/s] Writing: 58%|█████▊ | 4.94G/8.53G [00:47<00:34, 105Mbyte/s] Writing: 58%|█████▊ | 4.96G/8.53G [00:47<00:33, 105Mbyte/s] Writing: 59%|█████▉ | 5.02G/8.53G [00:48<00:33, 106Mbyte/s] Writing: 59%|█████▉ | 5.05G/8.53G [00:48<00:32, 106Mbyte/s] Writing: 59%|█████▉ | 5.06G/8.53G [00:48<00:32, 106Mbyte/s] Writing: 60%|██████ | 5.13G/8.53G [00:49<00:32, 104Mbyte/s] Writing: 61%|██████ | 5.19G/8.53G [00:49<00:31, 105Mbyte/s] Writing: 62%|██████▏ | 5.26G/8.53G [00:50<00:31, 105Mbyte/s] Writing: 62%|██████▏ | 5.28G/8.53G [00:50<00:30, 106Mbyte/s] Writing: 62%|██████▏ | 5.30G/8.53G [00:50<00:30, 106Mbyte/s] Writing: 63%|██████▎ | 5.36G/8.53G [00:51<00:32, 98.7Mbyte/s] Writing: 64%|██████▎ | 5.43G/8.53G [00:52<00:30, 102Mbyte/s] Writing: 64%|██████▍ | 5.49G/8.53G [00:52<00:30, 101Mbyte/s] Writing: 65%|██████▌ | 5.55G/8.53G [00:53<00:28, 103Mbyte/s] Writing: 66%|██████▌ | 5.61G/8.53G [00:54<00:28, 104Mbyte/s] Writing: 66%|██████▌ | 5.63G/8.53G [00:54<00:27, 105Mbyte/s] Writing: 66%|██████▌ | 5.65G/8.53G [00:54<00:27, 105Mbyte/s] Writing: 67%|██████▋ | 5.72G/8.53G [00:55<00:27, 104Mbyte/s] Writing: 68%|██████▊ | 5.78G/8.53G [00:55<00:26, 105Mbyte/s] Writing: 68%|██████▊ | 5.84G/8.53G 
[00:56<00:25, 105Mbyte/s] Writing: 69%|██████▉ | 5.87G/8.53G [00:56<00:25, 106Mbyte/s] Writing: 69%|██████▉ | 5.88G/8.53G [00:56<00:25, 106Mbyte/s] Writing: 70%|██████▉ | 5.95G/8.53G [00:57<00:24, 104Mbyte/s] Writing: 70%|███████ | 6.01G/8.53G [00:57<00:24, 105Mbyte/s] Writing: 71%|███████ | 6.08G/8.53G [00:58<00:23, 105Mbyte/s] Writing: 71%|███████▏ | 6.10G/8.53G [00:58<00:22, 106Mbyte/s] Writing: 72%|███████▏ | 6.12G/8.53G [00:58<00:22, 106Mbyte/s] Writing: 72%|███████▏ | 6.18G/8.53G [00:59<00:22, 104Mbyte/s] Writing: 73%|███████▎ | 6.25G/8.53G [01:00<00:21, 105Mbyte/s] Writing: 74%|███████▍ | 6.31G/8.53G [01:00<00:21, 105Mbyte/s] Writing: 74%|███████▍ | 6.33G/8.53G [01:00<00:20, 106Mbyte/s] Writing: 74%|███████▍ | 6.35G/8.53G [01:01<00:20, 106Mbyte/s] Writing: 75%|███████▌ | 6.41G/8.53G [01:01<00:20, 104Mbyte/s] Writing: 76%|███████▌ | 6.48G/8.53G [01:02<00:19, 105Mbyte/s] Writing: 77%|███████▋ | 6.54G/8.53G [01:02<00:19, 105Mbyte/s] Writing: 77%|███████▋ | 6.56G/8.53G [01:03<00:18, 105Mbyte/s] Writing: 77%|███████▋ | 6.58G/8.53G [01:03<00:18, 105Mbyte/s] Writing: 78%|███████▊ | 6.65G/8.53G [01:03<00:18, 104Mbyte/s] Writing: 79%|███████▊ | 6.71G/8.53G [01:04<00:17, 105Mbyte/s] Writing: 79%|███████▉ | 6.77G/8.53G [01:05<00:16, 105Mbyte/s] Writing: 80%|███████▉ | 6.79G/8.53G [01:05<00:16, 106Mbyte/s] Writing: 80%|███████▉ | 6.81G/8.53G [01:05<00:16, 106Mbyte/s] Writing: 81%|████████ | 6.88G/8.53G [01:06<00:15, 104Mbyte/s] Writing: 81%|████████▏ | 6.94G/8.53G [01:06<00:15, 105Mbyte/s] Writing: 82%|████████▏ | 7.00G/8.53G [01:07<00:14, 106Mbyte/s] Writing: 82%|████████▏ | 7.03G/8.53G [01:07<00:14, 106Mbyte/s] Writing: 83%|████████▎ | 7.04G/8.53G [01:07<00:14, 106Mbyte/s] Writing: 83%|████████▎ | 7.11G/8.53G [01:08<00:13, 104Mbyte/s] Writing: 84%|████████▍ | 7.17G/8.53G [01:08<00:12, 105Mbyte/s] Writing: 85%|████████▍ | 7.23G/8.53G [01:09<00:12, 106Mbyte/s] Writing: 85%|████████▌ | 7.26G/8.53G [01:09<00:12, 106Mbyte/s] Writing: 85%|████████▌ | 7.27G/8.53G 
[01:09<00:11, 106Mbyte/s] Writing: 86%|████████▌ | 7.34G/8.53G [01:10<00:11, 104Mbyte/s] Writing: 87%|████████▋ | 7.40G/8.53G [01:11<00:10, 105Mbyte/s] Writing: 88%|████████▊ | 7.47G/8.53G [01:11<00:10, 106Mbyte/s] Writing: 88%|████████▊ | 7.49G/8.53G [01:11<00:09, 106Mbyte/s] Writing: 88%|████████▊ | 7.51G/8.53G [01:12<00:09, 106Mbyte/s] Writing: 89%|████████▉ | 7.57G/8.53G [01:12<00:09, 104Mbyte/s] Writing: 89%|████████▉ | 7.64G/8.53G [01:13<00:08, 105Mbyte/s] Writing: 90%|█████████ | 7.70G/8.53G [01:13<00:07, 105Mbyte/s] Writing: 90%|█████████ | 7.72G/8.53G [01:14<00:07, 105Mbyte/s] Writing: 91%|█████████ | 7.74G/8.53G [01:14<00:07, 105Mbyte/s] Writing: 91%|█████████▏| 7.81G/8.53G [01:14<00:06, 105Mbyte/s] Writing: 92%|█████████▏| 7.87G/8.53G [01:15<00:06, 105Mbyte/s] Writing: 92%|█████████▏| 7.89G/8.53G [01:15<00:06, 106Mbyte/s] Writing: 93%|█████████▎| 7.91G/8.53G [01:15<00:05, 106Mbyte/s] Writing: 99%|█████████▉| 8.47G/8.53G [01:21<00:00, 101Mbyte/s] Writing: 100%|█████████▉| 8.53G/8.53G [01:22<00:00, 101Mbyte/s] Writing: 100%|██████████| 8.53G/8.53G [01:22<00:00, 104Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to /home/alperen/grpo/model/unsloth.Q8_0.gguf
Unsloth: Conversion completed! Output location: /home/alperen/grpo/model/unsloth.Q8_0.gguf
Unsloth: Saved Ollama Modelfile to model/Modelfile
wandb:
wandb: 🚀 View run sft_train at: https://wandb.ai/alperenyildiz-nus/R4VD_Training/runs/rxcxxtd3
wandb: Find logs at: wandb/run-20250531_205259-rxcxxtd3/logs
`

I am using the original Llama3_(8B)_Ollama.ipynb from Unsloth.
Here are the dependencies I am using: ` accelerate==1.7.0 aiohappyeyeballs==2.4.4 aiohttp==3.10.11 aiosignal==1.3.1 annotated-types==0.7.0 anyio==4.5.2 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 arrow==1.3.0 asttokens==3.0.0 async-lru==2.0.5 async-timeout==4.0.3 attrs==25.1.0 babel==2.17.0 beautifulsoup4==4.13.4 bitsandbytes==0.45.5 bleach==6.2.0 certifi==2025.4.26 cffi==1.17.1 charset-normalizer==3.4.2 click==8.2.1 comm==0.2.2 cut-cross-entropy==25.1.1 dataclasses-json==0.6.7 datasets==3.6.0 debugpy==1.8.14 decorator==5.2.1 defusedxml==0.7.1 diffusers==0.33.1 dill==0.3.8 distro==1.9.0 docker-pycreds==0.4.0 docstring_parser==0.16 exceptiongroup==1.2.2 executing==2.2.0 faiss-cpu==1.8.0.post1 fastjsonschema==2.21.1 filelock==3.18.0 fqdn==1.5.1 frozenlist==1.5.0 fsspec==2025.3.0 gguf==0.16.3 gitdb==4.0.12 GitPython==3.1.44 greenlet==3.1.1 h11==0.14.0 hf-xet==1.1.2 hf_transfer==0.1.9 httpcore==1.0.7 httpx==0.28.1 huggingface-hub==0.32.3 idna==3.10 importlib_metadata==8.7.0 ipykernel==6.29.5 ipython==9.2.0 ipython_pygments_lexers==1.1.1 ipywidgets==8.1.7 isoduration==20.11.0 jedi==0.19.2 Jinja2==3.1.6 jiter==0.8.2 joblib==1.5.1 json5==0.12.0 jsonpatch==1.33 jsonpointer==3.0.0 jsonschema==4.24.0 jsonschema-specifications==2025.4.1 jupyter==1.1.1 jupyter-console==6.6.3 jupyter-events==0.12.0 jupyter-lsp==2.2.5 jupyter_client==8.6.3 jupyter_core==5.8.1 jupyter_server==2.16.0 jupyter_server_terminals==0.5.3 jupyterlab==4.4.3 jupyterlab_pygments==0.3.0 jupyterlab_server==2.27.3 jupyterlab_widgets==3.0.15 langchain==0.2.17 langchain-community==0.2.19 langchain-core==0.2.43 langchain-ollama==0.1.3 langchain-openai==0.1.25 langchain-text-splitters==0.2.4 langsmith==0.1.147 markdown-it-py==3.0.0 MarkupSafe==3.0.2 marshmallow==3.22.0 matplotlib-inline==0.1.7 mdurl==0.1.2 mistune==3.1.3 mpmath==1.3.0 msgspec==0.19.0 multidict==6.1.0 multiprocess==0.70.16 mypy-extensions==1.0.0 nbclient==0.10.2 nbconvert==7.16.6 nbformat==5.10.4 nest-asyncio==1.6.0 networkx==3.4.2 
notebook==7.4.3 notebook_shim==0.2.4 numpy==2.0.2 nvidia-cublas-cu12==12.6.4.1 nvidia-cuda-cupti-cu12==12.6.80 nvidia-cuda-nvrtc-cu12==12.6.77 nvidia-cuda-runtime-cu12==12.6.77 nvidia-cudnn-cu12==9.5.1.17 nvidia-cufft-cu12==11.3.0.4 nvidia-cufile-cu12==1.11.1.6 nvidia-curand-cu12==10.3.7.77 nvidia-cusolver-cu12==11.7.1.2 nvidia-cusparse-cu12==12.5.4.2 nvidia-cusparselt-cu12==0.6.3 nvidia-nccl-cu12==2.26.2 nvidia-nvjitlink-cu12==12.6.85 nvidia-nvtx-cu12==12.6.77 ollama==0.4.7 openai==1.65.5 orjson==3.10.15 overrides==7.7.0 packaging==25.0 pandas==2.2.3 pandocfilters==1.5.1 parso==0.8.4 peft==0.15.2 pexpect==4.9.0 pillow==11.2.1 platformdirs==4.3.8 prometheus_client==0.22.0 prompt_toolkit==3.0.51 propcache==0.2.0 protobuf==3.20.3 psutil==7.0.0 ptyprocess==0.7.0 pure_eval==0.2.3 pyarrow==20.0.0 pycparser==2.22 pydantic==2.10.6 pydantic_core==2.27.2 Pygments==2.19.1 PyMuPDF==1.24.11 python-dateutil==2.9.0.post0 python-json-logger==3.3.0 pytz==2025.2 PyYAML==6.0.2 pyzmq==26.4.0 referencing==0.36.2 regex==2024.11.6 requests==2.32.3 requests-toolbelt==1.0.0 rfc3339-validator==0.1.4 rfc3986-validator==0.1.1 rich==14.0.0 rpds-py==0.25.1 safetensors==0.5.3 scikit-learn==1.6.1 scipy==1.15.3 Send2Trash==1.8.3 sentence-transformers==4.1.0 sentencepiece==0.2.0 sentry-sdk==2.29.1 setproctitle==1.3.6 shtab==1.7.2 six==1.17.0 smmap==5.0.2 sniffio==1.3.1 soupsieve==2.7 SQLAlchemy==2.0.38 stack-data==0.6.3 sympy==1.14.0 tenacity==8.5.0 terminado==0.18.1 threadpoolctl==3.6.0 tiktoken==0.7.0 tinycss2==1.4.0 tokenizers==0.21.1 torch==2.7.0 torchvision==0.22.0 tornado==6.5.1 tqdm==4.67.1 traitlets==5.14.3 transformers==4.52.4 triton==3.3.0 trl==0.15.2 typeguard==4.4.2 types-python-dateutil==2.9.0.20250516 typing-inspect==0.9.0 typing_extensions==4.13.2 tyro==0.9.21 tzdata==2025.2 unsloth @ git+https://github.com/unslothai/unsloth.git@beef0cbcb6ecf1fa126589bd2877be85a91bfb8f unsloth_zoo==2025.5.11 uri-template==1.3.0 urllib3==2.4.0 wandb==0.19.11 wcwidth==0.2.13 webcolors==24.11.1 
webencodings==0.5.1 websocket-client==1.8.0 widgetsnbextension==4.0.14 xformers==0.0.30 xxhash==3.5.0 yarl==1.15.2 zipp==3.21.0
`

### Relevant log output

```shell
```

### OS

Linux

### GPU

Nvidia

### CPU

AMD

### Ollama version

0.5.7

GiteaMirror added the bug label 2026-05-04 17:35:42 -05:00

GiteaMirror commented 2026-05-04 17:35:43 -05:00

@rick-github commented on GitHub (May 31, 2025):

Does [updating ollama](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-upgrade-ollama) help?

GiteaMirror commented 2026-05-04 17:35:44 -05:00

@alperen21 commented on GitHub (Jun 1, 2025):

> Does [updating ollama](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-upgrade-ollama) help?

Upgrading to 0.9.0 only changes the error to:
`
Error: unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970
`

`ollama create` outputs:
`
gathering model components
copying file sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 100%
parsing GGUF
using existing layer sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970
using existing layer sha256:95b5361453780fb5797ce5abfe9a330f5d33fdec13d2232ef1443ee0c3a86ecc
using existing layer sha256:a00752320fd9088ddeea7cc185c72564737eb377034554e0bc7fa1cdf69ab36f
writing manifest
success
`
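Since the original error complained that the tokenizer merges could not be found in the model file, one way to narrow this down is to list the metadata keys actually stored in the GGUF. The following is a minimal sketch using only the Python standard library, assuming the GGUF v2+ header layout (magic, version, tensor count, key/value count, then typed key/value pairs) and the `tokenizer.ggml.merges` key name from the GGUF specification. The function names are hypothetical, and whether a missing key is the root cause here is not established.

```python
import struct
from typing import BinaryIO, List

# Byte widths of the fixed-size GGUF value types (type id -> size).
_SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1,
                 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read(f: BinaryIO, fmt: str) -> int:
    """Read one little-endian value described by a struct format."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _skip_value(f: BinaryIO, vtype: int) -> None:
    """Advance past one metadata value without decoding it."""
    if vtype in _SCALAR_SIZES:
        f.seek(_SCALAR_SIZES[vtype], 1)
    elif vtype == _STRING:
        f.seek(_read(f, "<Q"), 1)  # uint64 length, then the bytes
    elif vtype == _ARRAY:
        elem_type = _read(f, "<I")
        count = _read(f, "<Q")
        if elem_type in _SCALAR_SIZES:
            f.seek(_SCALAR_SIZES[elem_type] * count, 1)
        else:
            for _ in range(count):
                _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_metadata_keys(f: BinaryIO) -> List[str]:
    """Return the metadata key names stored in a GGUF file header."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        # Assumption: v1 used 32-bit counts, which this sketch does not handle.
        raise ValueError("GGUF v1 is not supported by this sketch")
    _read(f, "<Q")  # tensor count, not needed here
    kv_count = _read(f, "<Q")
    keys = []
    for _ in range(kv_count):
        key_len = _read(f, "<Q")
        keys.append(f.read(key_len).decode("utf-8"))
        _skip_value(f, _read(f, "<I"))
    return keys
```

For example, `with open("unsloth.Q8_0.gguf", "rb") as f: print("tokenizer.ggml.merges" in gguf_metadata_keys(f))` would report whether the converted file carries the merges list at all.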

GiteaMirror commented 2026-05-04 17:35:46 -05:00

@rick-github commented on GitHub (Jun 1, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.

GiteaMirror commented 2026-05-04 17:35:48 -05:00

@alperen21 commented on GitHub (Jun 1, 2025):

> [Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may aid in debugging.

Nothing is logged:
`
journalctl -u ollama --no-pager --follow --pager-end
Hint: You are currently not seeing messages from other users and the system.
Users in groups 'adm', 'systemd-journal' can see all messages.
Pass -q to turn off this notice.
-- Logs begin at Sat 2025-04-26 03:57:25 +08. --
`

GiteaMirror commented 2026-05-04 17:35:48 -05:00

@rick-github commented on GitHub (Jun 1, 2025):

```
sudo journalctl -u ollama --no-pager -S today
```
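Once the journal is readable, the structured server lines can be filtered by level so that warnings and errors stand out from the startup noise. This is a minimal sketch, assuming the `time=... level=... source=... msg=...` layout that the Ollama server lines in this thread use; `LogEntry`, `parse_entries`, and `problems` are hypothetical names, not part of Ollama.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class LogEntry(NamedTuple):
    time: str
    level: str
    source: str
    msg: str


# Matches the key=value part of a line; msg is either quoted or a bare token.
_PATTERN = re.compile(
    r'time=(?P<time>\S+) level=(?P<level>\w+) source=(?P<source>\S+) '
    r'msg=(?P<msg>"(?:[^"\\]|\\.)*"|\S+)'
)


def parse_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield one LogEntry per line that carries the structured fields."""
    for line in lines:
        match = _PATTERN.search(line)
        if match is None:
            continue  # systemd and GIN lines do not use this layout
        msg = match.group("msg")
        if msg.startswith('"'):
            msg = msg[1:-1]  # drop the surrounding quotes
        yield LogEntry(match.group("time"), match.group("level"),
                       match.group("source"), msg)


def problems(lines: Iterable[str],
             levels: Tuple[str, ...] = ("WARN", "ERROR")) -> List[LogEntry]:
    """Return only the entries whose level is in `levels`."""
    return [entry for entry in parse_entries(lines) if entry.level in levels]
```

With the journal piped in, `problems(sys.stdin)` (after `import sys`) would return just the warning and error entries from the run that failed to load the model.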

GiteaMirror commented 2026-05-04 17:35:48 -05:00

@alperen21 commented on GitHub (Jun 1, 2025):

sudo journalctl -u ollama --no-pager -S today

-- Logs begin at Fri 2025-04-25 23:59:59 +08, end at Sun 2025-06-01 15:19:55 +08. --
Jun 01 13:52:28 i2r-spd-0030576 systemd[1]: ollama.service: Succeeded.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: ollama.service: Scheduled restart job, restart counter is at 1.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: Stopped Ollama Service.
Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: Started Ollama Service.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: 2025/06/01 13:52:31 routes.go:1187: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.963+08:00 level=INFO source=images.go:432 msg="total blobs: 6"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.965+08:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: - using env: export GIN_MODE=release
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: - using code: gin.SetMode(gin.ReleaseMode)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/pull --> github.com/ollama/ollama/server.(*Server).PullHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/generate --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/chat --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/embed --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/embeddings --> github.com/ollama/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/create --> github.com/ollama/ollama/server.(*Server).CreateHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/push --> github.com/ollama/ollama/server.(*Server).PushHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/copy --> github.com/ollama/ollama/server.(*Server).CopyHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] DELETE /api/delete --> github.com/ollama/ollama/server.(*Server).DeleteHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/show --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/ps --> github.com/ollama/ollama/server.(*Server).PsHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/chat/completions --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/completions --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/embeddings --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /v1/models --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /v1/models/:model --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (6 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=routes.go:1238 msg="Listening on 127.0.0.1:11434 (version 0.5.7)"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=routes.go:1267 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11_avx cuda_v12_avx rocm_avx]"
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
Jun 01 13:52:32 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:32.053+08:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3090" total="23.7 GiB" available="22.2 GiB"
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Stopping Ollama Service...
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: ollama.service: Succeeded.
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Stopped Ollama Service.
Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Started Ollama Service.
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.261+08:00 level=INFO source=routes.go:1234 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.263+08:00 level=INFO source=images.go:479 msg="total blobs: 6"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=images.go:486 msg="total unused blobs removed: 0"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=routes.go:1287 msg="Listening on 127.0.0.1:11434 (version 0.9.0)"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.368+08:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3090" total="23.7 GiB" available="22.2 GiB"
Jun 01 13:53:24 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:53:24 | 200 | 53.146µs | 127.0.0.1 | GET "/api/version"
Jun 01 13:53:54 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:53:54 | 200 | 18.698µs | 127.0.0.1 | HEAD "/"
Jun 01 13:54:11 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:11 | 200 | 56.416µs | 127.0.0.1 | POST "/api/blobs/sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:54:11 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:11 | 200 | 12.96016ms | 127.0.0.1 | POST "/api/create"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 200 | 21.979µs | 127.0.0.1 | HEAD "/"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 200 | 17.786978ms | 127.0.0.1 | POST "/api/show"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.806+08:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 gpu=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 parallel=2 available=23845470208 required="9.7 GiB"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.852+08:00 level=INFO source=server.go:135 msg="system memory" total="62.7 GiB" free="39.6 GiB" free_swap="0 B"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.852+08:00 level=INFO source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[22.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="9.7 GiB" memory.required.partial="9.7 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[9.7 GiB]" memory.weights.total="7.4 GiB" memory.weights.repeating="6.9 GiB" memory.weights.nonrepeating="532.3 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 (version GGUF V3 (latest))
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 0: general.architecture str = llama
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 1: general.name str = model
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 2: llama.block_count u32 = 32
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 3: llama.context_length u32 = 131072
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 10: general.file_type u32 = 7
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128009
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128004
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 21: general.quantization_version u32 = 2
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type f32: 65 tensors
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type q8_0: 226 tensors
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file format = GGUF V3 (latest)
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file type = Q8_0
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file size = 7.95 GiB (8.50 BPW)
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_load_from_file_impl: failed to load model
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.902+08:00 level=INFO source=sched.go:455 msg="NewLlamaServer failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 error="unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 500 | 178.506735ms | 127.0.0.1 | POST "/api/generate"
Jun 01 13:54:46 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:46 | 200 | 22.997µs | 127.0.0.1 | GET "/api/version"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:35 | 200 | 16.303µs | 127.0.0.1 | HEAD "/"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/baseline_:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6a7c51e44480776aaf091a4f31ab4cd153364eb0d041b5a2d32f1845a8feebf2: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-coder-v2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34488e453cfe3232810bac05c55d94a471228086fcac9e6b00ef3a671e21fa66: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3:70b-instruct error="open /usr/share/ollama/.ollama/models/blobs/sha256-ea8e06d28e479230d9ea75e58a9c6fddad874fdb103a242988bf6bda3a49a085: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/plswork:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-bb11382a7789de02dc236a571472c09994b22906770b7cfbb2dfe639f76659f1: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-e934a1db7bb7d7202828298224a280df0c09985d1d35147c301574a20dfe5129: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/smollm-135m-instruct-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6592d34cb09d8fc73587e33c8a10b24009042ef781a61ad77dbcba80d74dae4f: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/testing:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-b94880755682395e160906f09f8aeee5dce2d8515d2e346dfaa55166c10d6f87: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-r1:1.5b error="open /usr/share/ollama/.ollama/models/blobs/sha256-a85fe2a2e58e2426116d3686dfdc1a6ea58640c1e684069976aa730be6c1fa01: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.1:8b error="open /usr/share/ollama/.ollama/models/blobs/sha256-455f34728c9b5dd3376378bfb809ee166c145b0b4c1f1a6feca069055066ef9a: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/phi:2.7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-4ce4b16d33a334b872b8cc4f9d6929905d0bfa19bdc90c5cbed95700d22f747f: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth_valid:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-94e9c8af1fe1dc86a7c901a3fb868164bf35f42ab34a2395c3b630ff8b44f21a: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-coder:1.3b error="open /usr/share/ollama/.ollama/models/blobs/sha256-d55c9eb1669a22f75956872166c676634c77cd8dfb94900640bd09a474dfcd0c: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_bert:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-14977e9a990f2f0b51e30486892df6230413f8027ff76298f4249c2fb97219d8: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_jaccard:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-3d0f8987b16341b857b7a13cde92fc230c4d96a93f7f06a7b929922a95552fd9: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_sft_2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-235236765ff77ca7bf2de13d4dd6f6cd9046e632c8b6f8936f6899bdc5542d92: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llamagrpo_vuln-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-49af9141aa1a0e9e89c252ac918273b9960256468087291b8d2c23436a0469d0: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/gemma:2b error="open /usr/share/ollama/.ollama/models/blobs/sha256-887433b89a901c156f7e6944442f3c9e57f3c55d6ed52042cbb7303aea994290: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mymodel:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-b94880755682395e160906f09f8aeee5dce2d8515d2e346dfaa55166c10d6f87: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mytest3:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-800fec5151b78ad63c6b5dbb73d514c00d5b370c9292e7bb6f824fd707595ba0: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/rlhf:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-43ae35401b26683bfc3a25735ec29e38e32d9d2cda2228dd73090e3dd90563ac: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/rlhf_model:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-eb2c36bd4e64f86cde431fd548624019d9eff04661c643cb9eb035f886303a40: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth_valid_full:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-c306eacce700a3e6dfa98d0c7a197d4eef8363985914191aee565f0717ef9dba: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=hf.co/alperenyildiz/Llama-3.2-1B-Instruct_q8_0_GRPO:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-59413b696cad54fe177854e01e5c2d519a2d52dc24501a063c790419da6bf3d0: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/codellama:7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-316526ac7323d6f42305c5bbf1939e1197487c1e6ea1f01292ceb5e3040b707a: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/hftest:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-1dc2a8b64db29f052b583cfd197c6c3e178f090306d4a6934b8b62e1dd4e94a9: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/qwen:1.8b error="open /usr/share/ollama/.ollama/models/blobs/sha256-9ece4a97bfb61bdb539531db5584fa119ad55684281d8a2d864339ae3fdd6c15: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-8100e67f2b81eac2c8c5beff8524a0797e07f90b90ce99c81b04642f2f7f64e3: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_testing:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6bbd3b8c4d39e9dd39400627de896f7d332c4ad70c53a7ccb0143df14cb5eb3b: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34a733ed52f1629251c5271080fefc67c21308740366ff1c3563634d185be301: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.1:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-1a4c3c319823fdabddb22479d0b10820a7a39fe49e45c40bae28fbe83926dc14: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/ollama_finetuned_grpo:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-0ce3c9184f3cc1c097a3758ef826f707b4f0c6f9848e5ae61ee09eecae1303e4: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/test:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-aefc9828fc72f25eee51b335dab430ab077f9fd0cd10aa80beb041f06e621d04: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_sft:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-79939762984cfb3c300a6bfdcccfa082f5dcff08ca0b0a57d400b0dc5c8a18b0: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_single:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-f7a62ea19aca03083e0efa70074db40736ef4a2d248329813548193b8b10a057: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.2:1b error="open /usr/share/ollama/.ollama/models/blobs/sha256-4f659a1e86d7f5a33c389f7991e7224b7ee6ad0358b53437d54c02d2e1b1118d: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mytest:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-3c866098594e4ec49cb7304dbd5d67b3b71b8c2d0225589d3d6e3d908caa52e8: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34bb5ab01051a11372a91f95f3fbbc51173eed8e7f13ec395b9ae9b8bd0e242b: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mistral:7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-42347cd80dc868877d2807869c0e9c90034392b2f1f001cae1563488021e2e19: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.098+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_test:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-77a2ccc8a0bedde7e1ce052ae894bc75c054e5b8594b514bcbe1e7ea6b85a1c7: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.098+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/smolgrpo_vuln-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-98e18d220d09de6cc3277405c524d54670b48f0d6f0c25cda62d99b185d9b327: no such file or directory"
Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:35 | 200 | 2.020727ms | 127.0.0.1 | GET "/api/tags"
Jun 01 13:58:50 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:50 | 200 | 16.377µs | 127.0.0.1 | HEAD "/"
Jun 01 13:59:28 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:28 | 201 | 22.096522855s | 127.0.0.1 | POST "/api/blobs/sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:59:28 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:28 | 200 | 162.958513ms | 127.0.0.1 | POST "/api/create"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 200 | 42.797µs | 127.0.0.1 | HEAD "/"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 200 | 26.341074ms | 127.0.0.1 | POST "/api/show"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.653+08:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 gpu=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 parallel=2 available=23828299776 required="9.7 GiB"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.701+08:00 level=INFO source=server.go:135 msg="system memory" total="62.7 GiB" free="39.7 GiB" free_swap="0 B"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.701+08:00 level=INFO source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[22.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="9.7 GiB" memory.required.partial="9.7 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[9.7 GiB]" memory.weights.total="7.4 GiB" memory.weights.repeating="6.9 GiB" memory.weights.nonrepeating="532.3 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 (version GGUF V3 (latest))
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 0: general.architecture str = llama
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 1: general.name str = model
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 2: llama.block_count u32 = 32
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 3: llama.context_length u32 = 131072
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 10: general.file_type u32 = 7
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128009
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128004
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 21: general.quantization_version u32 = 2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type f32: 65 tensors
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type q8_0: 226 tensors
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file format = GGUF V3 (latest)
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file type = Q8_0
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file size = 7.95 GiB (8.50 BPW)
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_load_from_file_impl: failed to load model
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.728+08:00 level=INFO source=sched.go:455 msg="NewLlamaServer failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 error="unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 500 | 164.725683ms | 127.0.0.1 | POST "/api/generate"
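
A note on reading the dump above: the 22 metadata keys it lists include `tokenizer.ggml.model`, `tokenizer.ggml.pre`, `tokenizer.ggml.tokens` and `tokenizer.ggml.token_type`, but no `tokenizer.ggml.merges` entry, which is consistent with the "cannot find tokenizer merges" error. The following is a minimal sketch, not part of Ollama or llama.cpp, that scans journal output for the `llama_model_loader: - kv` lines and reports whether that key is listed. The helper names `parse_kv` and `has_merges` are hypothetical, and the regular expression assumes the line layout shown in this log.

```python
import re

# Matches the metadata lines printed by llama_model_loader, e.g.
# "... llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2"
# Assumption: key and type contain no spaces, as in the log above.
KV_LINE = re.compile(r"llama_model_loader: - kv\s+\d+:\s+(\S+)\s+(\S+)\s+=\s+(.*)$")


def parse_kv(log_text):
    """Return {key: (type, value)} for every kv line found in the log text."""
    found = {}
    for line in log_text.splitlines():
        match = KV_LINE.search(line)
        if match:
            key, kind, value = match.groups()
            found[key] = (kind, value.strip())
    return found


def has_merges(log_text):
    """True if the metadata dump lists a tokenizer.ggml.merges key."""
    return "tokenizer.ggml.merges" in parse_kv(log_text)


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
"""

if __name__ == "__main__":
    print(sorted(parse_kv(SAMPLE)))
    print("merges present:", has_merges(SAMPLE))
```

To check a full log, the output of `sudo journalctl -u ollama --no-pager -S today` can be saved to a file and its contents passed to `has_merges`. This only inspects what the loader printed; it does not read the GGUF file itself.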

 @alperen21 commented on GitHub (Jun 1, 2025): > ``` > sudo journalctl -u ollama --no-pager -S today > ``` -- Logs begin at Fri 2025-04-25 23:59:59 +08, end at Sun 2025-06-01 15:19:55 +08. -- Jun 01 13:52:28 i2r-spd-0030576 systemd[1]: ollama.service: Succeeded. Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: ollama.service: Scheduled restart job, restart counter is at 1. Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: Stopped Ollama Service. Jun 01 13:52:31 i2r-spd-0030576 systemd[1]: Started Ollama Service. Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: 2025/06/01 13:52:31 routes.go:1187: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]" Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.963+08:00 level=INFO source=images.go:432 msg="total blobs: 6" Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.965+08:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0" Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached. 
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production. Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: - using env: export GIN_MODE=release Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: - using code: gin.SetMode(gin.ReleaseMode) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/pull --> github.com/ollama/ollama/server.(*Server).PullHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/generate --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/chat --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/embed --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/embeddings --> github.com/ollama/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/create --> github.com/ollama/ollama/server.(*Server).CreateHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/push --> github.com/ollama/ollama/server.(*Server).PushHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/copy --> github.com/ollama/ollama/server.(*Server).CopyHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] DELETE /api/delete --> github.com/ollama/ollama/server.(*Server).DeleteHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/show --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers) 
Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/ps --> github.com/ollama/ollama/server.(*Server).PsHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/chat/completions --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/completions --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (6 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] POST /v1/embeddings --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (6 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /v1/models --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (6 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /v1/models/:model --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (6 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] GET /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers) Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: [GIN-debug] HEAD /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers) Jun 01 
13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=routes.go:1238 msg="Listening on 127.0.0.1:11434 (version 0.5.7)" Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=routes.go:1267 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11_avx cuda_v12_avx rocm_avx]" Jun 01 13:52:31 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:31.966+08:00 level=INFO source=gpu.go:226 msg="looking for compatible GPUs" Jun 01 13:52:32 i2r-spd-0030576 ollama[317976]: time=2025-06-01T13:52:32.053+08:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3090" total="23.7 GiB" available="22.2 GiB" Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Stopping Ollama Service... Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: ollama.service: Succeeded. Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Stopped Ollama Service. Jun 01 13:53:15 i2r-spd-0030576 systemd[1]: Started Ollama Service. 
Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.261+08:00 level=INFO source=routes.go:1234 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.263+08:00 level=INFO source=images.go:479 msg="total blobs: 6" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=images.go:486 msg="total unused blobs removed: 0" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=routes.go:1287 msg="Listening on 127.0.0.1:11434 (version 0.9.0)" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.264+08:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs" Jun 01 13:53:15 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:53:15.368+08:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3090" total="23.7 GiB" available="22.2 
GiB" Jun 01 13:53:24 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:53:24 | 200 | 53.146µs | 127.0.0.1 | GET "/api/version" Jun 01 13:53:54 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:53:54 | 200 | 18.698µs | 127.0.0.1 | HEAD "/" Jun 01 13:54:11 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:11 | 200 | 56.416µs | 127.0.0.1 | POST "/api/blobs/sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970" Jun 01 13:54:11 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:11 | 200 | 12.96016ms | 127.0.0.1 | POST "/api/create" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 200 | 21.979µs | 127.0.0.1 | HEAD "/" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 200 | 17.786978ms | 127.0.0.1 | POST "/api/show" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.806+08:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 gpu=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 parallel=2 available=23845470208 required="9.7 GiB" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.852+08:00 level=INFO source=server.go:135 msg="system memory" total="62.7 GiB" free="39.6 GiB" free_swap="0 B" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.852+08:00 level=INFO source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[22.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="9.7 GiB" memory.required.partial="9.7 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[9.7 GiB]" memory.weights.total="7.4 GiB" memory.weights.repeating="6.9 GiB" memory.weights.nonrepeating="532.3 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB" Jun 01 13:54:16 i2r-spd-0030576 
ollama[318199]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 (version GGUF V3 (latest)) Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 0: general.architecture str = llama Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 1: general.name str = model Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 2: llama.block_count u32 = 32 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 3: llama.context_length u32 = 131072 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 10: general.file_type u32 = 7 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: 
llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ... Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128009 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128004 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 21: general.quantization_version u32 = 2 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type f32: 65 tensors Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type q8_0: 226 tensors Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file format = GGUF V3 (latest) Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file type = Q8_0 Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: print_info: file size = 7.95 GiB (8.50 BPW) Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: llama_model_load_from_file_impl: failed to load model Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:54:16.902+08:00 level=INFO source=sched.go:455 msg="NewLlamaServer failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 error="unable to load model: 
/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970" Jun 01 13:54:16 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:16 | 500 | 178.506735ms | 127.0.0.1 | POST "/api/generate" Jun 01 13:54:46 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:54:46 | 200 | 22.997µs | 127.0.0.1 | GET "/api/version" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:35 | 200 | 16.303µs | 127.0.0.1 | HEAD "/" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/baseline_:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6a7c51e44480776aaf091a4f31ab4cd153364eb0d041b5a2d32f1845a8feebf2: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-coder-v2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34488e453cfe3232810bac05c55d94a471228086fcac9e6b00ef3a671e21fa66: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3:70b-instruct error="open /usr/share/ollama/.ollama/models/blobs/sha256-ea8e06d28e479230d9ea75e58a9c6fddad874fdb103a242988bf6bda3a49a085: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/plswork:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-bb11382a7789de02dc236a571472c09994b22906770b7cfbb2dfe639f76659f1: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest 
filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-e934a1db7bb7d7202828298224a280df0c09985d1d35147c301574a20dfe5129: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/smollm-135m-instruct-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6592d34cb09d8fc73587e33c8a10b24009042ef781a61ad77dbcba80d74dae4f: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/testing:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-b94880755682395e160906f09f8aeee5dce2d8515d2e346dfaa55166c10d6f87: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-r1:1.5b error="open /usr/share/ollama/.ollama/models/blobs/sha256-a85fe2a2e58e2426116d3686dfdc1a6ea58640c1e684069976aa730be6c1fa01: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.1:8b error="open /usr/share/ollama/.ollama/models/blobs/sha256-455f34728c9b5dd3376378bfb809ee166c145b0b4c1f1a6feca069055066ef9a: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/phi:2.7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-4ce4b16d33a334b872b8cc4f9d6929905d0bfa19bdc90c5cbed95700d22f747f: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: 
time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama_unsloth_valid:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-94e9c8af1fe1dc86a7c901a3fb868164bf35f42ab34a2395c3b630ff8b44f21a: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/deepseek-coder:1.3b error="open /usr/share/ollama/.ollama/models/blobs/sha256-d55c9eb1669a22f75956872166c676634c77cd8dfb94900640bd09a474dfcd0c: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_bert:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-14977e9a990f2f0b51e30486892df6230413f8027ff76298f4249c2fb97219d8: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_jaccard:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-3d0f8987b16341b857b7a13cde92fc230c4d96a93f7f06a7b929922a95552fd9: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_sft_2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-235236765ff77ca7bf2de13d4dd6f6cd9046e632c8b6f8936f6899bdc5542d92: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llamagrpo_vuln-q4_k_m:latest error="open 
/usr/share/ollama/.ollama/models/blobs/sha256-49af9141aa1a0e9e89c252ac918273b9960256468087291b8d2c23436a0469d0: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/gemma:2b error="open /usr/share/ollama/.ollama/models/blobs/sha256-887433b89a901c156f7e6944442f3c9e57f3c55d6ed52042cbb7303aea994290: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mymodel:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-b94880755682395e160906f09f8aeee5dce2d8515d2e346dfaa55166c10d6f87: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mytest3:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-800fec5151b78ad63c6b5dbb73d514c00d5b370c9292e7bb6f824fd707595ba0: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/rlhf:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-43ae35401b26683bfc3a25735ec29e38e32d9d2cda2228dd73090e3dd90563ac: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/rlhf_model:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-eb2c36bd4e64f86cde431fd548624019d9eff04661c643cb9eb035f886303a40: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" 
name=registry.ollama.ai/library/sft_valid_llama_unsloth_valid_full:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-c306eacce700a3e6dfa98d0c7a197d4eef8363985914191aee565f0717ef9dba: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=hf.co/alperenyildiz/Llama-3.2-1B-Instruct_q8_0_GRPO:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-59413b696cad54fe177854e01e5c2d519a2d52dc24501a063c790419da6bf3d0: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/codellama:7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-316526ac7323d6f42305c5bbf1939e1197487c1e6ea1f01292ceb5e3040b707a: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/hftest:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-1dc2a8b64db29f052b583cfd197c6c3e178f090306d4a6934b8b62e1dd4e94a9: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/qwen:1.8b error="open /usr/share/ollama/.ollama/models/blobs/sha256-9ece4a97bfb61bdb539531db5584fa119ad55684281d8a2d864339ae3fdd6c15: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-8100e67f2b81eac2c8c5beff8524a0797e07f90b90ce99c81b04642f2f7f64e3: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: 
time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_testing:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-6bbd3b8c4d39e9dd39400627de896f7d332c4ad70c53a7ccb0143df14cb5eb3b: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_valid_llama:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34a733ed52f1629251c5271080fefc67c21308740366ff1c3563634d185be301: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.1:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-1a4c3c319823fdabddb22479d0b10820a7a39fe49e45c40bae28fbe83926dc14: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/ollama_finetuned_grpo:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-0ce3c9184f3cc1c097a3758ef826f707b4f0c6f9848e5ae61ee09eecae1303e4: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/test:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-aefc9828fc72f25eee51b335dab430ab077f9fd0cd10aa80beb041f06e621d04: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_sft:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-79939762984cfb3c300a6bfdcccfa082f5dcff08ca0b0a57d400b0dc5c8a18b0: no 
such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/grpo_single:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-f7a62ea19aca03083e0efa70074db40736ef4a2d248329813548193b8b10a057: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.2:1b error="open /usr/share/ollama/.ollama/models/blobs/sha256-4f659a1e86d7f5a33c389f7991e7224b7ee6ad0358b53437d54c02d2e1b1118d: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mytest:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-3c866098594e4ec49cb7304dbd5d67b3b71b8c2d0225589d3d6e3d908caa52e8: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/llama3.2:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-34bb5ab01051a11372a91f95f3fbbc51173eed8e7f13ec395b9ae9b8bd0e242b: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.097+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/mistral:7b error="open /usr/share/ollama/.ollama/models/blobs/sha256-42347cd80dc868877d2807869c0e9c90034392b2f1f001cae1563488021e2e19: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.098+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/sft_test:latest error="open 
/usr/share/ollama/.ollama/models/blobs/sha256-77a2ccc8a0bedde7e1ce052ae894bc75c054e5b8594b514bcbe1e7ea6b85a1c7: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:58:35.098+08:00 level=WARN source=routes.go:920 msg="bad manifest filepath" name=registry.ollama.ai/library/smolgrpo_vuln-q4_k_m:latest error="open /usr/share/ollama/.ollama/models/blobs/sha256-98e18d220d09de6cc3277405c524d54670b48f0d6f0c25cda62d99b185d9b327: no such file or directory" Jun 01 13:58:35 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:35 | 200 | 2.020727ms | 127.0.0.1 | GET "/api/tags" Jun 01 13:58:50 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:58:50 | 200 | 16.377µs | 127.0.0.1 | HEAD "/" Jun 01 13:59:28 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:28 | 201 | 22.096522855s | 127.0.0.1 | POST "/api/blobs/sha256:a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970" Jun 01 13:59:28 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:28 | 200 | 162.958513ms | 127.0.0.1 | POST "/api/create" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 200 | 42.797µs | 127.0.0.1 | HEAD "/" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 200 | 26.341074ms | 127.0.0.1 | POST "/api/show" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.653+08:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 gpu=GPU-23b55d1c-f844-da92-7301-32ba511dffc8 parallel=2 available=23828299776 required="9.7 GiB" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.701+08:00 level=INFO source=server.go:135 msg="system memory" total="62.7 GiB" free="39.7 GiB" free_swap="0 B" Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.701+08:00 level=INFO 
source=server.go:168 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[22.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="9.7 GiB" memory.required.partial="9.7 GiB" memory.required.kv="1.0 GiB" memory.required.allocations="[9.7 GiB]" memory.weights.total="7.4 GiB" memory.weights.repeating="6.9 GiB" memory.weights.nonrepeating="532.3 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 (version GGUF V3 (latest))
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 0: general.architecture str = llama
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 1: general.name str = model
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 2: llama.block_count u32 = 32
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 3: llama.context_length u32 = 131072
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 10: general.file_type u32 = 7
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 128009
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 19: tokenizer.ggml.padding_token_id u32 = 128004
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 20: tokenizer.ggml.add_bos_token bool = true
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - kv 21: general.quantization_version u32 = 2
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type f32: 65 tensors
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_loader: - type q8_0: 226 tensors
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file format = GGUF V3 (latest)
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file type = Q8_0
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: print_info: file size = 7.95 GiB (8.50 BPW)
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: llama_model_load_from_file_impl: failed to load model
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: time=2025-06-01T13:59:38.728+08:00 level=INFO source=sched.go:455 msg="NewLlamaServer failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970 error="unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a254dfca168d6d2f571888cb52efd310f4116d9ecb11e8c3730e2924c8631970"
Jun 01 13:59:38 i2r-spd-0030576 ollama[318199]: [GIN] 2025/06/01 - 13:59:38 | 500 | 164.725683ms | 127.0.0.1 | POST "/api/generate"
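
The metadata dump in this log lists `tokenizer.ggml.model = gpt2` (a BPE vocabulary) but no `tokenizer.ggml.merges` entry among its 22 keys, which is consistent with the "cannot find tokenizer merges" error. As a rough diagnostic sketch, the dump can be scanned for that key. The function names and the `REQUIRED_BPE_KEYS` set below are hypothetical helpers chosen for illustration, not part of Ollama or llama.cpp, and the script only reads log text; it does not repair the GGUF file.

```python
import re

# Captures the key name from loader dump entries such as
# "llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2".
KV_ENTRY = re.compile(r"- kv\s+\d+:\s+(\S+)\s+\S+\s+=")

# Assumption for this sketch: keys a BPE ("gpt2") vocabulary is expected to carry.
REQUIRED_BPE_KEYS = {"tokenizer.ggml.tokens", "tokenizer.ggml.merges"}


def metadata_keys(log_text):
    """Return the set of metadata key names found in a loader dump."""
    return set(KV_ENTRY.findall(log_text))


def missing_bpe_keys(log_text):
    """Return the expected BPE tokenizer keys that the dump does not list."""
    return sorted(REQUIRED_BPE_KEYS - metadata_keys(log_text))


# A few entries copied from the log above, shortened for the example.
sample = """
llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", ...
llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 128000
"""

print(missing_bpe_keys(sample))
```

If the merges key is reported missing, the GGUF export step is the likely place to look, since the loader can only use what the file contains.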

Reference: github-starred/ollama#69252