[GH-ISSUE #4136] [Feature] Rapid Modelfile Updates #64608

Closed
opened 2026-05-03 18:19:34 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @Arcitec on GitHub (May 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4136

Ollama is an absolutely brilliant project. Thank you everyone involved in creating it!

I've been working on local models, and noticed one weakness of Ollama. The initial import obviously has to take some time to convert the GGUF model weights into Ollama's native format. But after that, I need to tweak parameters, stop words, temperature, template, etc., to perfect the model.

This is where Ollama falls apart a little bit.

First of all, I spent an hour scouring the documentation about how to update a locally created model. Finally, I figured out that to update a locally created model, you have to run `ollama create` with the *exact same* parameters again.

It would be great to add a note about that to these three locations:

- https://github.com/ollama/ollama?tab=readme-ov-file#create-a-model
- https://github.com/ollama/ollama/blob/main/docs/modelfile.md
- https://github.com/ollama/ollama/blob/main/docs/import.md

Anyway, when I finally figured out that you have to "create" the model again, I noticed that it's taking a very long time. 23 seconds, to be precise. It reads the model files on disk again, converts them again, hashes them, finally figures out that the hashes match the on-disk data (`using already created layer sha256:...`), and only then updates the stored model to match the latest Modelfile contents.

I have two potential ideas for improvements.

  • Option 1: Add a flag to `ollama create` named `--keep-weights` or similar. This would do an instantaneous update of the Modelfile parameters, while immediately reusing the latest, previously converted weights.
  • Option 2: Automatically detect changed weights by tracking the file modification timestamp of local model files, and skip re-conversion when the Modelfile's `FROM` file timestamp matches the previously imported model's metadata. This would be a very convenient solution and would avoid user error (such as users accidentally importing a Modelfile that belongs to another model under a mismatched model name). Ollama could simply alert the user that it detected no changes to the weights, and tell them they can run again with `--convert-weights` if they want to force weight conversion anyway (someone might want that?).

Anyway, I really hope that this can be improved in some way, because waiting half a minute after every tiny Modelfile parameter tweak, just to check the changes, is a very tedious process. Other than this, Ollama has been absolutely perfect! :)

GiteaMirror added the feature request label 2026-05-03 18:19:34 -05:00
Author
Owner

@Arcitec commented on GitHub (May 3, 2024):

Speaking of the documentation; the main README describes `ollama pull` as "This command can also be used to update a local model. Only the diff will be pulled."

This is a bit misleading and is phrased as if it's usable for updating locally created models.

A clearer phrasing might be something like "This command downloads models from a registry, and can also be used to update a previously downloaded model. Only the diff will be pulled. Not usable for locally created models."

I spent way too long trying to figure out why "pull" wasn't working for my local(ly created) model.

Another thing that might help is if the `ollama pull` command provides a clearer error message. Currently, it is very cryptic:

```
pulling manifest 
Error: pull model manifest: file does not exist
```

Clarifying that error message would go a long way toward clearing up the usage of `pull`. For example, a very simple tweak would be:

```
pulling manifest from registry
Error: pull model manifest: that model does not exist in the registry
```
Author
Owner

@FellowTraveler commented on GitHub (May 28, 2024):

It seems like even if I pull a 128k context model, it actually defaults to a 2048-token context for all of them?
And the only way around it is to create a custom Modelfile, which COPIES the model?
These models are really, really big. Is it really making a separate copy of the entire model just for me to customize the context parameter to 16k instead of 2k ???

Author
Owner

@rick-github commented on GitHub (Jan 19, 2026):

> It reads the model files on disk again, converts them again, hashes them

Use the model that was initially created as the basis for the update. For example, the first model was created with the following Modelfile:

```dockerfile
FROM qwen3-27b.gguf
TEMPLATE "{{ .Prompt }}"
SYSTEM You are a helpful assistant
```

with `ollama create my-model -f Modelfile`. Now you want to update the model to change the size of the context window, so create a new Modelfile that references the original model:

```dockerfile
FROM my-model
PARAMETER num_ctx 16384
```

Run `ollama create my-model-c16k -f Modelfile` and the new model `my-model-c16k` will reuse the existing sha256 blob. Alternatively, running `ollama create my-model -f Modelfile` will replace the original model with the new one.

> And only way around it is to create a custom modelfile, which COPIES the model?

No, the blobs are not copied, they are referenced from the manifest file of the new model.
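To illustrate why no copy happens: each model name maps to a small manifest file, and the large weights blob is referenced by its sha256 digest, so two manifests can point at the same blob on disk. A hypothetical, abbreviated sketch of such a manifest (OCI-style layout; media types, digests, and sizes are illustrative placeholders, not real values):

```json
{
  "schemaVersion": 2,
  "config": { "digest": "sha256:<config-digest>", "size": 412 },
  "layers": [
    {
      "mediaType": "application/vnd.ollama.image.model",
      "digest": "sha256:<weights-digest>",
      "size": 16000000000
    },
    {
      "mediaType": "application/vnd.ollama.image.params",
      "digest": "sha256:<params-digest>",
      "size": 59
    }
  ]
}
```

A derived model like `my-model-c16k` gets its own manifest with a new params layer, but its weights layer carries the same digest as the original, so the 16 GB blob is stored once.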

Reference: github-starred/ollama#64608