[GH-ISSUE #12099] Feature Request: Add CLI flags for generation parameters in ollama run #33801

Closed
opened 2026-04-22 16:49:25 -05:00 by GiteaMirror · 2 comments

Originally created by @Kwiatek47 on GitHub (Aug 27, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12099

Right now, if I want to change parameters like temperature, top_p, or num_predict, I need to edit a Modelfile or call the API with JSON. This makes quick testing and prototyping from the command line harder. It would be nice to have something similar to Hugging Face pipelines and other ML tools.
Example:
ollama run --temperature 0.0 gpt-oss:20b
ollama run -t 0.7 -p 0.9 --max-tokens 256 gpt-oss:20b
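
For reference, the two existing workarounds mentioned above look roughly like this (a sketch based on the documented Modelfile PARAMETER directive and the /api/generate options field; the derived model name my-gpt-oss and the prompt are placeholders):

# Bake the parameters into a derived model via a Modelfile
FROM gpt-oss:20b
PARAMETER temperature 0.0
PARAMETER top_p 0.9
PARAMETER num_predict 256

ollama create my-gpt-oss -f Modelfile
ollama run my-gpt-oss

# Or pass the parameters per-request through the REST API
curl http://localhost:11434/api/generate -d '{
  "model": "gpt-oss:20b",
  "prompt": "Why is the sky blue?",
  "options": { "temperature": 0.0, "top_p": 0.9, "num_predict": 256 }
}'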

GiteaMirror added the feature request label 2026-04-22 16:49:25 -05:00

@rick-github commented on GitHub (Aug 27, 2025):

#5362
https://github.com/ollama/ollama/issues/9723#issuecomment-2819548324


@pdevine commented on GitHub (Aug 27, 2025):

If you're running in the CLI you can also use `/set parameter temperature 0`. There's online help in the CLI if you use:

>>> /set parameter
Available Parameters:
  /set parameter seed <int>             Random number seed
  /set parameter num_predict <int>      Max number of tokens to predict
  /set parameter top_k <int>            Pick from top k num of tokens
  /set parameter top_p <float>          Pick token based on sum of probabilities
  /set parameter min_p <float>          Pick token based on top token probability * min_p
  /set parameter num_ctx <int>          Set the context size
  /set parameter temperature <float>    Set creativity level
  /set parameter repeat_penalty <float> How strongly to penalize repetitions
  /set parameter repeat_last_n <int>    Set how far back to look for repetitions
  /set parameter num_gpu <int>          The number of layers to send to the GPU
  /set parameter stop <string> <string> ...   Set the stop parameters
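
For instance, an interactive session using these commands might look like the following sketch (model name reused from the original request; model output omitted):

ollama run gpt-oss:20b
>>> /set parameter temperature 0
>>> /set parameter num_predict 256
>>> Why is the sky blue?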

I'll close this as a dupe.

