[GH-ISSUE #5744] Model Cold Storage and user manual management possibility #29337

Open
opened 2026-04-22 08:06:26 -05:00 by GiteaMirror · 5 comments

Originally created by @nikhil-swamix on GitHub (Jul 17, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5744

![image](https://github.com/user-attachments/assets/5cdcde7e-1b1e-4d62-af03-a925751d1b1d)

# model management

It's nearly impossible to manage models manually, because everything is stored as hash-named blob files.

What I was trying to do was move some models to cold storage (i.e., an HDD) and keep others on the SSD, but I couldn't find a way other than moving the whole repository, and I'm faced with this:

![image](https://github.com/user-attachments/assets/1483f7ff-8dd3-4d2a-a441-372957bd1509)

This kind of management takes a lot of time due to the sheer volume of data to transfer. And why does one model generate hundreds of blobs? Can't they be stored in a folder per model rather than littered everywhere? My best bet is to check the date-modified time and sort the files out from that.
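
For reference, a minimal PowerShell sketch that maps each installed model to the blobs it uses, assuming the default Windows store location (`%USERPROFILE%\.ollama\models`) and Ollama's manifest layout (adjust the path if `OLLAMA_MODELS` is set):

```ps
# Sketch: list which blob files belong to each installed model.
# Assumes the default Windows store and Ollama's OCI-style manifests;
# manifest files live under manifests\<registry>\<namespace>\<model>\<tag>.
$store = "$env:USERPROFILE\.ollama\models"
Get-ChildItem -Recurse -File (Join-Path $store "manifests") | ForEach-Object {
    $manifest = Get-Content $_.FullName -Raw | ConvertFrom-Json
    $model = "$($_.Directory.Name):$($_.Name)"   # e.g. llama3:latest
    foreach ($digest in @($manifest.config.digest) + $manifest.layers.digest) {
        [PSCustomObject]@{
            Model = $model
            Blob  = Join-Path $store "blobs\$($digest -replace ':','-')"
        }
    }
}
```

Grouping or sorting that output by `Model` gives the per-model view that the flat blob store itself doesn't provide.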

# proposed

`ollama archive <model_name> <Disk_or_path>`

`ollama pull <Disk_or_path>` would then show options for which models to revive to the cache.

# urgent request

# observations

Some users have reported that `ollama pull` takes a long time (#2850, #5361, etc.). I suspect it's the way SSDs handle allocation of huge reserved space: for the first few seconds Task Manager shows 1 GB/s, then it falls to 200 MB/s, then to a pathetic 5 MB/s. It could be a write-protection mechanism. Maybe download in chunks and merge? Or use a Hugging Face-style loader with `_part001`, `_part002`, etc. for loading layers?

@bmizerany @drnic @anaisbetts @sqs @lstep

GiteaMirror added the feature request label 2026-04-22 08:06:26 -05:00

@anaisbetts commented on GitHub (Jul 17, 2024):

@nikhil-swamix Please do not CC random people in GitHub issues. I am not a maintainer on this project.


@rick-github commented on GitHub (Jul 17, 2024):

A model shouldn't generate 100s of blobs. Do the files have a suffix? That might indicate a failed download; re-pulling the model might help.

All of the blobs that make up a model are listed in the model manifest; a simple shell script could use that information to manage the blobs (eg, copy to a new location and symlink back to the blob store).

I have noticed the transfer slowdown you mention; it looks to me like the traffic is being rate-limited at the source. I find that if I stop and restart the pull, the download rate saturates my link.
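
For illustration, a minimal PowerShell sketch of that manifest-driven approach, assuming the default Windows store layout; the model name (`llama3:latest`) and the archive path are placeholders:

```ps
# Sketch: move one model's blobs to cold storage and symlink them back,
# following the manifest as described above. The store path, manifest
# layout, and example model are assumptions; run elevated (or with
# Developer Mode enabled) so symlink creation is permitted.
$store   = "$env:USERPROFILE\.ollama\models"
$archive = "E:\ColdStorage\ollama\blobs"          # hypothetical HDD path
$manifest = Get-Content -Raw (Join-Path $store "manifests\registry.ollama.ai\library\llama3\latest") |
            ConvertFrom-Json

New-Item -ItemType Directory -Path $archive -Force | Out-Null

foreach ($digest in @($manifest.config.digest) + $manifest.layers.digest) {
    $name = $digest -replace ':', '-'             # blobs are stored as sha256-<hex>
    $src  = Join-Path $store "blobs\$name"
    if ((Get-Item $src).LinkType) { continue }    # already a symlink, skip
    Move-Item $src (Join-Path $archive $name)
    New-Item -ItemType SymbolicLink -Path $src -Target (Join-Path $archive $name) | Out-Null
}
```

Reversing the process (deleting the symlink and moving the blob back) would "revive" the model to fast storage.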


@nikhil-swamix commented on GitHub (Aug 31, 2024):

> @nikhil-swamix Please do not CC random people in GitHub issues. I am not a maintainer on this project.

Sorry about that; your name showed up in the default @-mention list, which is usually made up of active users. Any inconvenience is regretted.


@nikhil-swamix commented on GitHub (Aug 31, 2024):

> A model shouldn't generate 100s of blobs. Do the files have a suffix? That might indicate a failed download; re-pulling the model might help.
>
> All of the blobs that make up a model are listed in the model manifest; a simple shell script could use that information to manage the blobs (eg, copy to a new location and symlink back to the blob store).
>
> I have noticed the transfer slowdown you mention; it looks to me like the traffic is being rate-limited at the source. I find that if I stop and restart the pull, the download rate saturates my link.

Yes, after re-pulling they vanish. The bottleneck is allocating space on the drive. I upgraded my connection to 125 Mbps, which is somewhat better; it also turned out a dusty connector was causing increased hardware failures during I/O, and I cleaned the connector with WD-40. LOL.

Currently I move the hot cache to the fastest storage available with these commands:

```ps
Move-Item "F:\Models\ollama\blobs\sha256-5ff0abeeac1d2dbdd5455c0b49ba3b29a9ce3c1fb181b2eef2e948689d55d046" "D:\Models\ollama\blobs\"
New-Item -ItemType SymbolicLink -Path "F:\Models\ollama\blobs\sha256-5ff0abeeac1d2dbdd5455c0b49ba3b29a9ce3c1fb181b2eef2e948689d55d046" -Target "D:\Models\ollama\blobs\sha256-5ff0abeeac1d2dbdd5455c0b49ba3b29a9ce3c1fb181b2eef2e948689d55d046"
```

The D drive is an M.2 SSD.
I'm on PowerShell on Windows. The reason is that the M.2 drive reads at ~1.5 GB/s, so a model loads in a few seconds, while the HDD manages ~200 MB/s and takes about a minute; and since Ollama unloads models after a default 5-minute timeout, that does no justice to continued sessions, so I have to manually curl "keep_alive: -1" or set environment variables for the timeout. Regardless, it would be a great QOL improvement to have manageable storage locations. I can run the commands above on a per-model basis, but identifying the right hash is difficult, and I have to check the model manifests when multiple 7B models of similar sizes are present... identifying a model's blobs by file size alone just isn't feasible in File Explorer.
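
For reference, a minimal sketch of that keep-alive workaround against Ollama's local API (the model name is an example); sending a generate request with no prompt loads the model, and `keep_alive: -1` asks the server to keep it resident:

```ps
# Sketch: pin a model in memory indefinitely via the local API.
# Endpoint and fields are Ollama's /api/generate; model name is an example.
Invoke-RestMethod -Method Post -Uri "http://localhost:11434/api/generate" `
    -ContentType "application/json" `
    -Body (@{ model = "llama3"; keep_alive = -1 } | ConvertTo-Json)

# Or set the server-wide default before starting ollama:
# $env:OLLAMA_KEEP_ALIVE = "-1"
```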


@nikhil-swamix commented on GitHub (Aug 31, 2024):

![image](https://github.com/user-attachments/assets/527054a0-719d-4cf0-8a8d-0b0fc9a13b18)
As I understand it, you are active here and have a greater understanding of this project.
@mxyng @dhiltgen, if this is not planned, please close the issue; I will try to accomplish it with a Tkinter Python script to manage model storage.


Reference: github-starred/ollama#29337