[GH-ISSUE #4341] how to import Meta-Llama-3-120B-Instruct.imatrix #2704

Closed
opened 2026-04-12 13:01:26 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @taozhiyuai on GitHub (May 11, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4341

![WX20240511-115426@2x](https://github.com/ollama/ollama/assets/146583103/93328322-5cb6-48f8-a538-a99d03ce047e)

I want to import this model. May I know how to import Meta-Llama-3-120B-Instruct.imatrix?

GiteaMirror added the feature request label 2026-04-12 13:01:26 -05:00

@taozhiyuai commented on GitHub (May 11, 2024):

Is this the right Modelfile, or should I merge them into one GGUF?

```
FROM ./Meta-Llama-3-120B-Instruct-Q5_K_M-00001-of-00003.gguf
FROM ./Meta-Llama-3-120B-Instruct-Q5_K_M-00002-of-00003.gguf
FROM ./Meta-Llama-3-120B-Instruct-Q5_K_M-00003-of-00003.gguf
FROM ./Meta-Llama-3-120B-Instruct.imatrix

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER num_keep 24
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
```
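Editor's note: an Ollama Modelfile accepts only a single `FROM` source, and a `.imatrix` file is importance-matrix data consumed by llama.cpp's quantization tooling rather than model weights, so it is not something Ollama can import. Assuming the three shards have been merged into one file (the merged filename below is hypothetical), the import could be sketched as:

```
# Sketch: a single FROM pointing at the merged GGUF. The .imatrix file is
# omitted because it is only used when (re)quantizing with llama.cpp,
# not when importing or running the model.
FROM ./Meta-Llama-3-120B-Instruct-Q5_K_M.gguf
```

The `TEMPLATE` and `PARAMETER` lines above would stay as written.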


@DuckyBlender commented on GitHub (May 23, 2024):

You can merge them into one GGUF using llama.cpp; there's more info in the quantisation section of its README.
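A minimal sketch of that merge, assuming llama.cpp has been built and using the shard names from the Modelfile above (the `llama-gguf-split` tool ships with llama.cpp; older builds name it `gguf-split`):

```shell
# Derive the merged filename from the first shard, then merge.
# llama-gguf-split finds the remaining shards automatically from the
# -0000X-of-0000Y naming pattern, so only the first shard is passed.
first_shard="Meta-Llama-3-120B-Instruct-Q5_K_M-00001-of-00003.gguf"
merged="${first_shard%-00001-of-*.gguf}.gguf"

# Run the merge only if the tool and the shards are actually present:
if command -v llama-gguf-split >/dev/null && [ -f "$first_shard" ]; then
  llama-gguf-split --merge "$first_shard" "$merged"
fi
```

After merging, a single `FROM ./$merged` line in the Modelfile is all the import needs.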


@Milor123 commented on GitHub (Aug 28, 2024):

@DuckyBlender Do I need to use the imatrix file, or should I just ignore it and use the Q6 GGUF version? I want to import a model that comes with an imatrix file, but I don't know whether I should ignore it.
![image](https://github.com/user-attachments/assets/6bb669b7-a082-4e6f-9da2-ccae45c79e29)
Is it bad?


@DuckyBlender commented on GitHub (Aug 28, 2024):

Sorry, I'm not familiar with I-quants.

Reference: github-starred/ollama#2704