[GH-ISSUE #6410] How can I check model's default temperature in ollama #66065

Closed
opened 2026-05-03 23:50:25 -05:00 by GiteaMirror · 9 comments

Originally created by @xugy16 on GitHub (Aug 19, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6410

I do not know how to check a model's default temperature in Ollama. Could anyone help with that? For example, what is llama3.1's default temperature?


@rick-github commented on GitHub (Aug 19, 2024):

If a default temperature is set, it will be in the parameters of the Modelfile. llama3.1 doesn't set one:

```
$ ollama show --parameters llama3.1
stop                           "<|start_header_id|>"
stop                           "<|end_header_id|>"
stop                           "<|eot_id|>"
```

For a model with a default temperature:

```
$ ollama show --parameters lstep/neuraldaredevil-8b-abliterated:q8_0
num_ctx                        8192
num_keep                       24
stop                           "<|start_header_id|>"
stop                           "<|end_header_id|>"
stop                           "<|eot_id|>"
temperature                    0.7
```
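
The same parameter list can also be read over the HTTP API. A minimal sketch, assuming a recent Ollama where `/api/show` accepts a `model` field in the request body, the server is on the default `localhost:11434`, and `jq` is installed:

```sh
# Ask the server for model details; the "parameters" field mirrors
# what `ollama show --parameters` prints.
$ curl -s localhost:11434/api/show -d '{"model":"llama3.1"}' | jq -r .parameters
```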

@xugy16 commented on GitHub (Aug 19, 2024):

Thank you for your response, Rick. I know I can set the temperature via a Modelfile. @rick-github I am confused about:

  1. Is there any command or config file I can use to check the temperature?
  2. So is the default value 0.7 or 0.8 if I just use the default llama3.1:latest?

@rick-github commented on GitHub (Aug 19, 2024):

  1. `ollama show --parameters $MODEL | grep temperature`. If there's no output, there's no default temperature for the model.
  2. No. There is no `temperature` output from the command `ollama show --parameters llama3.1`, so there is no default temperature.

If there is no default temperature, the temperature is 0. You can change the temperature in the CLI:

```
$ ollama run llama3.1
>>> /set parameter temperature 0.7
>>>
```

Or via the API:

```
$ curl localhost:11434/api/generate -d '{"model":"llama3.1","options":{"temperature":0.7},"prompt":"why is the sky blue?"}'
```
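
If you want a model to always start with a particular temperature, you can also bake it into a derived model via a Modelfile. A minimal sketch (the model name `my-llama3.1` and the value 0.7 are only examples):

```sh
# Write a Modelfile that derives from llama3.1 and sets a default temperature,
# then build the derived model and confirm the parameter is recorded.
$ printf 'FROM llama3.1\nPARAMETER temperature 0.7\n' > Modelfile
$ ollama create my-llama3.1 -f Modelfile
$ ollama show --parameters my-llama3.1 | grep temperature
```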

@mxyng commented on GitHub (Aug 21, 2024):

If there's no temperature set for a model, either through the CLI, Modelfile, or in the request, the default is [0.8](https://github.com/ollama/ollama/blob/main/api/types.go#L592).
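
For reference, that compiled-in default lives in Ollama's Go source. A quick way to inspect it locally (a sketch, assuming the `DefaultOptions` values are still defined in `api/types.go`, which may move between releases):

```sh
# Clone the source and look for the temperature field and its default value.
$ git clone --depth 1 https://github.com/ollama/ollama.git
$ grep -n 'Temperature' ollama/api/types.go
```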


@rick-github commented on GitHub (Aug 21, 2024):

Thanks for the correction.


@mxyng commented on GitHub (Aug 21, 2024):

@xugy16 does that answer your question?


@mlibre commented on GitHub (Oct 14, 2024):

It would be nice if there were a CLI or API option that returned the `DefaultOptions` configuration.


@athmanar commented on GitHub (Oct 23, 2024):

If a request has been made with a temperature, is there any way to show the requested temperature value in the ollama serve terminal? I only see model params in the terminal, not the temperature of the current request.

I ask because I want to make sure that the temperature is in fact honored and applied.

```
llama_model_loader: loaded meta data with 22 key-value pairs and 723 tensors from /home/athmanar/.ollama/models/blobs/sha256-0bd51f8f0c975ce910ed067dcb962a9af05b77bafcdc595ef02178387f10e51d (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = Meta-Llama-3-70B-Instruct
llama_model_loader: - kv   2:                          llama.block_count u32              = 80
llama_model_loader: - kv   3:                       llama.context_length u32              = 8192
llama_model_loader: - kv   4:                     llama.embedding_length u32              = 8192
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 28672
llama_model_loader: - kv   6:                 llama.attention.head_count u32              = 64
llama_model_loader: - kv   7:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv   8:                       llama.rope.freq_base f32              = 500000.000000
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  10:                          general.file_type u32              = 2
llama_model_loader: - kv  11:                           llama.vocab_size u32              = 128256
llama_model_loader: - kv  12:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  13:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  14:                         tokenizer.ggml.pre str              = llama-bpe
llama_model_loader: - kv  15:                      tokenizer.ggml.tokens arr[str,128256]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  16:                  tokenizer.ggml.token_type arr[i32,128256]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  17:                      tokenizer.ggml.merges arr[str,280147]  = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv  18:                tokenizer.ggml.bos_token_id u32              = 128000
llama_model_loader: - kv  19:                tokenizer.ggml.eos_token_id u32              = 128009
llama_model_loader: - kv  20:                    tokenizer.chat_template str              = {% set loop_messages = messages %}{% ...
llama_model_loader: - kv  21:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  161 tensors
llama_model_loader: - type q4_0:  561 tensors
llama_model_loader: - type q6_K:    1 tensors
time=2024-10-23T16:33:38.698-07:00 level=INFO source=server.go:621 msg="waiting for server to become available" status="llm server loading model"
```

@rick-github commented on GitHub (Oct 24, 2024):

Requests are not logged. You can run a reverse proxy (which might be hard depending on the OS) to log the traffic (see https://github.com/ollama/ollama/issues/6565#issuecomment-2321507680 for a simple hack). If you are running Docker, you can add monitoring tools inside the container to log the traffic:

```sh
$ docker exec -it ollama bash -c 'apt update && apt install -y tcpflow'
$ docker exec -it ollama tcpflow -i any -c 'not port 11434'
```

You can also use `tcpflow` on an ollama install that uses systemd; you just need to find the port that the runner is using and pass that to the `tcpflow` command:

```sh
$ sudo tcpflow -c -i any "$(ps wwp$(pidof ollama_llama_server) | sed -ne 's/.*--port /port /p' | sed -ze 's/\n\(.\)/ or \1/g')"
```
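
Another lightweight option is to put a logging TCP relay in front of the server. A minimal sketch using `socat` (assumes `socat` is installed, the server listens on the default 11434, and the client is pointed at 11435 instead):

```sh
# Relay 11435 -> 11434 and dump both directions of the traffic to stderr,
# so the JSON request body (including any "temperature" option) is visible.
$ socat -v TCP-LISTEN:11435,fork,reuseaddr TCP:127.0.0.1:11434
# Point the client at the relay instead of the server:
$ curl localhost:11435/api/generate -d '{"model":"llama3.1","options":{"temperature":0.7},"prompt":"hi"}'
```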