[GH-ISSUE #1450] Use hard link to import GGUF on the same host to save disk space #776

Closed
opened 2026-04-12 10:27:28 -05:00 by GiteaMirror · 4 comments

Originally created by @shyeetsao on GitHub (Dec 10, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1450

If I understand correctly, the first step of a GGUF import is copying the binary into the model dir under a hashed name. As the number of models (mainly GGUF) grows, the duplicated binaries can take up a lot of disk space.
I think hard links, or using the raw GGUFs in place if possible, would save that space, though this only makes sense when the client & server are on the same host and the GGUF & model dir are on the same disk.
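
A minimal sketch of the hard-link idea, assuming the importer can see the source file directly. This is not Ollama's code: `link_or_copy` is a hypothetical helper, and falling back to a plain copy is just one way to handle the case where the GGUF and the model dir are on different filesystems.

```python
import os
import shutil


def link_or_copy(src: str, dst: str) -> str:
    """Hypothetical helper: hard-link src to dst, falling back to a copy.

    Returns "link" or "copy" to report which path was taken.
    """
    try:
        # A hard link adds a second directory entry for the same data,
        # so no extra disk space is used for the file contents.
        os.link(src, dst)
        return "link"
    except FileExistsError:
        # Do not silently overwrite an existing destination.
        raise
    except OSError:
        # Typically raised when src and dst are on different filesystems,
        # or when the filesystem does not support hard links.
        shutil.copy2(src, dst)
        return "copy"
```

When the link succeeds, both paths refer to the same data on disk, and the space is released only after both names are deleted.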


@technovangelist commented on GitHub (Dec 10, 2023):

Thanks for the comment. There is no duplication of files. If you have something like llama2 and then two models that are based on llama2 with different prompts and parameters, you will only have a single copy of the llama2 weights file. The other 2 leverage that file.
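
This kind of sharing is what a store gets when it names each weights file by a hash of its contents (the "hashed name" mentioned in the issue). A rough sketch, assuming SHA-256 and a flat blob directory; `import_blob` and the `sha256-` file name prefix are illustrative and not taken from Ollama's source.

```python
import hashlib
import os
import shutil


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weights files are not read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def import_blob(src: str, blob_dir: str) -> str:
    """Hypothetical import step: store src under a name derived from its digest.

    If a blob with the same digest already exists, it is reused, so models
    that share the same weights share a single file on disk.
    """
    os.makedirs(blob_dir, exist_ok=True)
    dst = os.path.join(blob_dir, "sha256-" + sha256_of(src))
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)  # the first import still copies the data once
    return dst
```

Under this scheme the blobs are deduplicated among themselves, but the original file outside the blob directory is still a separate copy.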


@shyeetsao commented on GitHub (Dec 11, 2023):

Hi @technovangelist, thanks for the clarification. Good to know that model files aren't duplicated when you create a new Ollama model, perhaps with a different prompt, from an existing Ollama one. However, what I was suggesting is that raw GGUF files, which are built by llama.cpp or downloaded from HF and haven't yet been imported as Ollama models, get duplicated during [imports](https://github.com/jmorganca/ollama/blob/main/docs/import.md#importing-gguf). Let me know if anything is unclear.


@phalexo commented on GitHub (Dec 11, 2023):

> Hi @technovangelist thanks for your clarification. Good to know model files don't duplicate when you create a new Ollama model, maybe with different prompt, from an existing Ollama one. However what I was suggesting is that the raw GGUF files, which are built by llama.cpp or downloaded from HF and haven't been imported as Ollama models, get duplicated during imports. Tell me if there is anything unclear.

Yes, I have been meaning to ask about this too. When one creates a model, the model weights appear to be duplicated into the Ollama repo, judging simply by how long it takes. Just writing some metadata would not take this long.


@mxyng commented on GitHub (Dec 11, 2023):

Yes, importing binary model weights requires copying the file. This is necessary because the server might be remote or in another context that makes the source file inaccessible.


Reference: github-starred/ollama#776