[GH-ISSUE #193] Ability to download LLAMA2 70b #62116

Closed
opened 2026-05-03 07:35:45 -05:00 by GiteaMirror · 7 comments

Originally created by @plannaAlain on GitHub (Jul 24, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/193

GiteaMirror added the model and feature request labels 2026-05-03 07:35:53 -05:00

@XBeg9 commented on GitHub (Jul 28, 2023):

Any workarounds to download 70b model?


@BruceMacD commented on GitHub (Jul 31, 2023):

We will add this. As a workaround in the meantime, you could try downloading a binary file from huggingface and running it with a Modelfile.

(this is untested, because I can't run a 70B model at the moment)

1. Download a 70B binary (ex: llama-2-70b.ggmlv3.q4_0.bin) from https://huggingface.co/TheBloke/Llama-2-70B-GGML/tree/main

2. Create a Modelfile:

```
FROM ./BIN_FILE_LOCATION

TEMPLATE """
{{- if .First }}
<<SYS>>
{{ .System }}
<</SYS>>
{{- end }}

[INST] {{ .Prompt }} [/INST]
"""

SYSTEM """
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
"""
```

3. In a terminal run: `ollama create NAME -f ./Modelfile`

4. `ollama run NAME` (a combined shell sketch of these steps follows below)
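A minimal end-to-end shell sketch of the workaround above, assuming the q4_0 binary from TheBloke's repo and a hypothetical model name `llama2-70b-local` (untested here, like the workaround itself):

```
# Download the 70B GGML binary from Hugging Face (a large file, tens of GB)
curl -L -O https://huggingface.co/TheBloke/Llama-2-70B-GGML/resolve/main/llama-2-70b.ggmlv3.q4_0.bin

# Write the Modelfile from step 2 (FROM line plus the TEMPLATE/SYSTEM sections shown above),
# then build and run the local model
ollama create llama2-70b-local -f ./Modelfile
ollama run llama2-70b-local
```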


@jvence commented on GitHub (Aug 1, 2023):

Tried following @BruceMacD's instructions but I get an error saying:
`>>> hello Error: failed to load model`


@burggraf commented on GitHub (Aug 1, 2023):

I tried running `ollama run llama2:70b` and while it did seem to download successfully, I also got the same error: `Error: failed to load model`. Note: I have an M1 Max with 64GB memory.


@jsdtaylor commented on GitHub (Aug 2, 2023):

> I tried running `ollama run llama2:70b` and while it did seem to download successfully, I also got the same error: `Error: failed to load model`. Note: I have an M1 Max with 64GB memory.

Same issue on an M1


@mxyng commented on GitHub (Aug 2, 2023):

We're still working on uploading llama2-70B. In the meantime, there are quirks with this model that require additional parameters to be set. In your Modelfile, add the line `PARAMETER num_gqa 8`. Make sure to update to 0.0.13, as that release contains changes required for running the 70B model.

Updating @BruceMacD's example Modelfile:

```
FROM ./BIN_FILE_LOCATION

PARAMETER num_gqa 8

TEMPLATE """
{{- if .First }}
<<SYS>>
{{ .System }}
<</SYS>>
{{- end }}

[INST] {{ .Prompt }} [/INST]
"""

SYSTEM """
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
"""
```

@mxyng commented on GitHub (Aug 4, 2023):

The 70b chat and non-chat models are available for download under the tags `llama2:70b-chat-q4_0` and `llama2:70b-q4_0`. Other quantization levels, such as K-quants, are available as well. Here's the full list of available models:

```
llama2:70b-chat-q3_K_L
llama2:70b-chat-q3_K_M
llama2:70b-chat-q3_K_S
llama2:70b-chat-q4_0
llama2:70b-chat-q4_1
llama2:70b-chat-q4_K_M
llama2:70b-chat-q4_K_S
llama2:70b-chat-q5_K_M
llama2:70b-chat-q5_K_S
llama2:70b-q2_K
llama2:70b-q3_K_L
llama2:70b-q3_K_M
llama2:70b-q3_K_S
llama2:70b-q4_0
llama2:70b-q4_1
llama2:70b-q4_K_M
llama2:70b-q4_K_S
llama2:70b-q5_K_M
```
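For example, pulling and running one of these tags from the command line (the specific quantization here is only an illustration; pick one that fits your machine's memory):

```
# Pull a specific 70B quantization, then start an interactive session with it
ollama pull llama2:70b-chat-q4_0
ollama run llama2:70b-chat-q4_0
```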

Reference: github-starred/ollama#62116