[GH-ISSUE #10207] API Default Model #68752

Closed
opened 2026-05-04 15:04:39 -05:00 by GiteaMirror · 6 comments

Originally created by @j820301 on GitHub (Apr 10, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10207

This idea might be a bit far-fetched, but I think it could be a useful feature.
When I provide the Ollama API to others, they must specify the model (it's a required field), for example llama3:70b.
I'm wondering if there is a default or implicit model parameter that could automatically apply my chosen default model when the API is accessed, so users don't need to specify an explicit `<model>:<tag>` but can instead use whatever model I provide.
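
For illustration, this is what every request looks like today, with the model named explicitly (default port shown):

```console
$ curl http://localhost:11434/api/generate -d '{
  "model": "llama3:70b",
  "prompt": "Hello"
}'
```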

For example: if I set the model behind my API to mistral:latest, other users could simply use model:main as the default-model alias to access the model I specified. That way, if I switch to other models in the future, users can keep using the same model parameter, model:main, to access it.

Is there a correct way to set this up? I know I can use ollama create default -f Modelfile, but I'm hoping I don't have to recreate a model each time just to change the name. If there's a simple and direct way to do this, I think it would be great.

Thank you again to the Ollama developers for their dedication.

GiteaMirror added the feature request label 2026-05-04 15:04:39 -05:00

@rick-github commented on GitHub (Apr 10, 2025):

There's currently no way to set a default model in ollama; a model needs to be created:

```console
$ ollama cp mistral:latest model:main
```

If you want to tweak model parameters, for example to change the temperature, then you have to use the Modelfile approach:

```console
$ echo FROM mistral:latest > Modelfile
$ echo PARAMETER temperature 0.5 >> Modelfile
$ ollama create model:main
```

Note that this is cheap to do: it's quick, and it doesn't duplicate the model weights, so the only extra storage is for the manifest file and the parameters blob, less than a kilobyte in total.
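
As a quick check (standard CLI, nothing beyond the commands above), ollama show confirms the new name resolves and carries the parameter:

```console
$ ollama show model:main
```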


@j820301 commented on GitHub (Apr 11, 2025):

Understood; thank you again, Rick, for your response. I will adopt the ollama cp approach for renaming models.
However, if I already have a model:main, will the cp cause a naming conflict when I update the model later?
That is, do I need to delete the original model:main before copying?
And one more question: will this cause any downtime for the model I provide to others?


@rick-github commented on GitHub (Apr 11, 2025):

If the model is not being used at the time of the cp, there will be no conflict. However, if the model is in use (i.e. loaded into ollama), then you can end up in a situation where the old and new models are both loaded at the same time. Any new requests to the ollama server will be handled by the new model, but if you have OLLAMA_KEEP_ALIVE=-1 the old model can hang around until it's evicted. There won't be any downtime; the first user of the new model will just see a slight delay on their query while the model loads.
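
As a sketch of the update flow this implies (the new model name here is illustrative):

```console
# Pull the replacement model, then repoint the alias.
# ollama cp overwrites the existing model:main manifest in place,
# so clients keep using the same name throughout.
$ ollama pull llama3.3:70b
$ ollama cp llama3.3:70b model:main
```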


@j820301 commented on GitHub (Apr 11, 2025):

Thank you, Rick, for your response. You always provide timely and effective solutions. I truly appreciate your answer and contribution. I will close this request.

If a mapping setting for default or proxy models is added in the future, I think it would still be a great feature to have, even though it's not essential.


@vansatchen commented on GitHub (Apr 11, 2025):

I resolved this issue by using nginx as a proxy with the Lua module (ngx_http_lua_module):
```nginx
location /llm/ {
    set $llmodel "aya-expanse:8b-q8_0";

    # Rewrite the "llmodel" placeholder in the request body to the real model name.
    access_by_lua_block {
        ngx.req.read_body()
        local body = ngx.req.get_body_data()

        if body then
            -- Replace via Lua pattern match
            local new_body, count = body:gsub(
                [["model"%s*:%s*"llmodel"]],
                [["model": "]] .. ngx.var.llmodel .. [["]]
            )

            if count > 0 then
                ngx.req.set_body_data(new_body)
                ngx.req.set_header("Content-Length", #new_body)
            end
        end
    }

    proxy_pass http://127.0.0.1:11434/;
    proxy_set_header Host localhost:11434;

    # Rewrite the real model name back to "llmodel" in the response body.
    body_filter_by_lua_block {
        ngx.arg[1] = ngx.re.gsub(ngx.arg[1], ngx.var.llmodel, 'llmodel')
        -- ngx.arg[2] = true
    }
}
```

This config replaces "llmodel" with "aya-expanse:8b-q8_0" (in my case) in the request, and replaces "aya-expanse:8b-q8_0" back with "llmodel" in the response.
Just use an address like http://host/llm/v1 to use it.
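
For example, a request through the proxy might look like this (a sketch, assuming the proxy is reachable at http://host and using Ollama's OpenAI-compatible /v1 endpoint):

```console
$ curl http://host/llm/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "llmodel",
      "messages": [{"role": "user", "content": "Hello"}]
    }'
```

The access phase swaps "llmodel" for the configured model before the request reaches ollama, and the body filter hides the real name again in the response.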


@j820301 commented on GitHub (Apr 14, 2025):

Hello vansatchen, thank you for your reply. I also use nginx to configure proxy web pages, so I will try this method. Thank you for sharing.
