[GH-ISSUE #4600] Fast-copy files on ollama create if accessible #49399

Open
opened 2026-04-28 11:40:36 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @spott on GitHub (May 24, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4600

`ollama create`, given a Modelfile that references a local .gguf file, currently makes a regular copy of the .gguf file.

A copy-on-write clone would save disk space and be faster, allowing people to keep .gguf files in multiple places without paying the disk-space penalty.

This is only possible on copy-on-write file systems: APFS (the default on all modern Macs), Btrfs, ZFS, etc.

Changing `cp` to `cp -c` should do it on macOS; `cp --reflink=auto` should work on Linux.
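The two commands above could be combined behind a platform switch; a rough sketch (the temp files stand in for a real .gguf source and destination, and note that `cp --reflink=auto` silently falls back to a plain copy on filesystems without reflink support):

```shell
# Sketch: platform-aware copy that asks for a copy-on-write clone.
src=$(mktemp)
dst="$src.copy"
printf 'fake gguf bytes' > "$src"

case "$(uname -s)" in
  Darwin)
    # APFS: -c requests a clonefile(2) clone of the data.
    cp -c "$src" "$dst"
    ;;
  *)
    # Btrfs/XFS/ZFS on Linux: reflink when supported, otherwise a
    # normal copy, so this is safe on any filesystem (e.g. tmpfs).
    cp --reflink=auto "$src" "$dst"
    ;;
esac

cmp -s "$src" "$dst" && echo "copies match"
```

Because `--reflink=auto` degrades gracefully, the Linux branch is safe to use unconditionally; only the macOS `-c` flag is platform-specific.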

GiteaMirror added the feature request label 2026-04-28 11:40:36 -05:00
Author
Owner

@spott commented on GitHub (May 24, 2024):

I would be happy to try and implement this, however I'm not sure where to find this part of the code base. If someone could point me in the right direction, I would appreciate it.

<!-- gh-comment-id:2128375356 -->
Author
Owner

@easp commented on GitHub (May 27, 2024):

Ollama is client-server over HTTP: `ollama create` sends data over HTTP to the server. The server might reside on a different machine from the client, run as a different user with different permissions, or keep model storage on another filesystem. So a file-system copy isn't used and wouldn't be general enough to cover all the use cases.

<!-- gh-comment-id:2133891817 -->
Author
Owner

@spott commented on GitHub (May 29, 2024):

I imagine the vast majority of users run ollama locally (though maybe I'm wrong about this; regardless, I imagine it is a lot of users), so accepting a file path when the server is known to be local would be an option.

Another option is to create a special "server side create" script that could be run.
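Detecting a local server could start from the configured host; a minimal sketch (the `is_local` helper and its address handling are hypothetical, and real `OLLAMA_HOST` parsing would need to cover schemes, IPv6, and more host forms):

```shell
# Hypothetical helper: treat only loopback-style hosts as "local".
is_local() {
  # Strip everything from the first colon onward, leaving the host part.
  case "${1%%:*}" in
    127.0.0.1|localhost) echo local ;;
    *) echo remote ;;
  esac
}

is_local "127.0.0.1:11434"          # -> local
is_local "models.example.com:11434" # -> remote
```

Only when the check says "local" (and the paths are visible to the server process) would the file-path fast path be attempted; everything else would keep the existing HTTP upload.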

> So, file system copy isn't used and wouldn't be general enough to cover all the use cases.

Yes, this would be a special-case optimization to drastically speed up creating new models and to save users gigabytes of disk space. It would only work on specific filesystems anyway. It isn't going to change the world, but it would be a welcome change that would help people.

<!-- gh-comment-id:2138231695 -->
Author
Owner

@lee-b commented on GitHub (Apr 21, 2025):

Personally, I have set up:

/models -- a GGUF directory on my server
/models -- a mount of that in my docker container

It would be MUCH better if ollama simply copied the Modelfile and hashed the existing image in this kind of setup, rather than copying/pulling the FROM file when that FROM file already exists on the server.

With most other inference services/engines, this works fine, with zero-copy: I mount the folder, and tell the service/engine to run the file. When large model files are involved, copying becomes an issue in terms of time and disk space. Especially when experimenting with modelfile config etc.
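The "hash the existing image" idea above can be sketched simply: a server-side file can be hashed in place to produce the content-addressed blob name, with no second copy of the weights (the temp file here stands in for a .gguf already mounted on the server):

```shell
# Sketch: derive the content-addressed blob digest from a file in place,
# instead of copying its bytes into the model store.
f=$(mktemp)
printf 'weights' > "$f"   # stand-in for a mounted .gguf
digest="sha256:$(sha256sum "$f" | cut -d' ' -f1)"
echo "$digest"
```

Whether the store could then reference the mounted path directly (versus reflinking it in) is a separate question, but the digest step itself never needs a copy.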

<!-- gh-comment-id:2818043405 -->

Reference: github-starred/ollama#49399