[GH-ISSUE #4199] support Llama 3 MoE #64650

Closed
opened 2026-05-03 18:26:02 -05:00 by GiteaMirror · 4 comments

Originally created by @taozhiyuai on GitHub (May 6, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4199

Please support:

- QuantFactory/Meta-Llama-3-120B-Instruct-GGUF
- raincandy-u/Llama-3-Aplite-Instruct-4x8B-GGUF-MoE

GiteaMirror added the model label 2026-05-03 18:26:02 -05:00

@taozhiyuai commented on GitHub (May 6, 2024):

```console
Last login: Tue May 7 07:23:10 on ttys001
taozhiyu@603e5f4a42f1 Q5KM % ollama create meta-llama-3-120b-instruct:q5_k_m -f modelfile
transferring model data
creating model layer
creating template layer
creating parameters layer
creating config layer
writing layer sha256:32ddfccf2cbe7c957fe33adf0982774b52757829e09523413177bd1246e0bd77
writing layer sha256:ca22cf90181bdb44001cc37be8ec8f1a36a40baf13a57575d55132c39a597a89
using already created layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f
writing layer sha256:f8464d7be52526f12ceb6b16b9187955f8dba5961aafc0be3ee2344e08f6e814
writing layer sha256:4ae5e46f5750be92c4885d1c5122a343bec38da83c380562ea85b83e18ca1952
writing manifest
success
taozhiyu@603e5f4a42f1 Q5KM % ollama list
NAME                                 ID              SIZE    MODIFIED
meta-llama-3-120b-instruct:q5_k_m    05b8e7ef8076    86 GB   9 seconds ago
taozhiyu@603e5f4a42f1 Q5KM % ollama run meta-llama-3-120b-instruct:q5_k_m
Error: Post "http://127.0.0.1:11434/api/chat": EOF
taozhiyu@603e5f4a42f1 Q5KM %
```
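An EOF on `/api/chat` usually means the server process died mid-request (often while loading a model that exceeds available memory) rather than the request being malformed. As a quick sanity check, a minimal sketch (assuming the default Ollama endpoint on `127.0.0.1:11434`) distinguishes a server that is still up from one that crashed:

```python
import urllib.request
import urllib.error


def server_up(base: str = "http://127.0.0.1:11434", timeout: float = 5.0) -> bool:
    """Return True if an Ollama server answers on its root endpoint."""
    try:
        with urllib.request.urlopen(base, timeout=timeout) as resp:
            # A healthy server responds 200 with the body "Ollama is running".
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

If this returns `False` right after the failed `ollama run`, the server itself went down and its log (not the client output) will contain the real error.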


@taozhiyuai commented on GitHub (May 6, 2024):

```
FROM /Users/taozhiyu/Downloads/M-GGUF/Meta-Llama-3-120B-Instruct-GGUF/Q5KM/Meta-Llama-3-120B-Instruct.Q5_K_M.gguf

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER num_keep 24
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
```
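For reference, here is a hand-expansion of what that template produces for a single system-plus-user turn (illustrative only; Ollama's Go templating does the real rendering, and the `system` and `prompt` values below are made-up examples):

```python
# Hand-expansion of the Llama 3 chat template from the Modelfile above.
system = "You are a helpful assistant."
prompt = "hello"

rendered = (
    f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
    f"<|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|>"
    f"<|start_header_id|>assistant<|end_header_id|>\n\n"
)
```

The prompt ends right after the assistant header, which is where the model begins generating; the `stop` parameters cut generation off at the next header or `<|eot_id|>` token.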


@taozhiyuai commented on GitHub (May 6, 2024):

Does anyone know how to solve this issue?


@rick-github commented on GitHub (Jan 19, 2026):

Seems to work now.

```console
$ ollama -v
ollama version is 0.14.2

$ ollama run meta-llama-3-120b-instruct:q5_k_m
>>> hello
Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat
about something in particular? I'm all ears (or rather, all text).

$ ollama run hf.co/raincandy-u/Llama-3-Aplite-Instruct-4x8B-GGUF-MoE:IQ4_XS
>>> hello
Hi there! I'm a friendly AI assistant. How are you doing today? Would you like to chat about
something in particular or just have a casual conversation? Let me know and I'll do my best to help!
```
Reference: github-starred/ollama#64650