Text Generation Documentation #3715

Open
opened 2025-11-12 11:50:06 -06:00 by GiteaMirror · 5 comments
Owner

Originally created by @Demirrr on GitHub (Jul 25, 2024).

Dear all,

we ❤️ Ollama. Thank you for this great framework. I wa

There are many parameters for text generation. Many of these parameters overlap with [llama.cpp](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md#common-options), while a few do not, e.g.

  1. num_thread
  2. repeat_last_n
  3. num_batch
  4. f16_kv

It would be great if you could write a few sentences documenting these.

```
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false,
  "options": {
    "num_keep": 5,
    "seed": 42,
    "num_predict": 100,
    "top_k": 20,
    "top_p": 0.9,
    "tfs_z": 0.5,
    "typical_p": 0.7,
    "repeat_last_n": 33,
    "temperature": 0.8,
    "repeat_penalty": 1.2,
    "presence_penalty": 1.5,
    "frequency_penalty": 1.0,
    "mirostat": 1,
    "mirostat_tau": 0.8,
    "mirostat_eta": 0.6,
    "penalize_newline": true,
    "stop": ["\n", "user:"],
    "numa": false,
    "num_ctx": 1024,
    "num_batch": 2,
    "num_gpu": 1,
    "main_gpu": 0,
    "low_vram": false,
    "f16_kv": true,
    "vocab_only": false,
    "use_mmap": true,
    "use_mlock": false,
    "num_thread": 8
  }
}'
```
GiteaMirror added the api, feature request, documentation labels 2025-11-12 11:50:07 -06:00

@rick-github commented on GitHub (Jul 25, 2024):

`num_thread` is the same as the llama.cpp `--threads` parameter.
`repeat_last_n` is `--repeat-last-n`.
`num_batch` is `--batch-size`.
`f16_kv` is the inverse of `--memory-f32`: if `f16_kv` is set, `--memory-f32` is not set. The default value for `f16_kv` is `true`.
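The mapping above can be sketched in a few lines; this is an illustrative Python snippet (the function name is mine, not part of Ollama or llama.cpp), showing how the four options would translate into llama.cpp CLI arguments, including the `f16_kv` inversion:

```python
# Translate the four Ollama options discussed above into their llama.cpp
# CLI equivalents. f16_kv is special: it is the inverse of --memory-f32,
# so --memory-f32 appears only when f16_kv is explicitly false
# (f16_kv defaults to true).
def to_llama_cpp_args(options: dict) -> list[str]:
    args = []
    if "num_thread" in options:
        args += ["--threads", str(options["num_thread"])]
    if "repeat_last_n" in options:
        args += ["--repeat-last-n", str(options["repeat_last_n"])]
    if "num_batch" in options:
        args += ["--batch-size", str(options["num_batch"])]
    if not options.get("f16_kv", True):
        args.append("--memory-f32")
    return args

print(to_llama_cpp_args({"num_thread": 8, "num_batch": 2, "f16_kv": True}))
# ['--threads', '8', '--batch-size', '2']
```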


@Demirrr commented on GitHub (Jul 25, 2024):

Perfect! Thank you !


@Demirrr commented on GitHub (Jul 25, 2024):

@rick-github what about `vocab_only`?


@rick-github commented on GitHub (Jul 25, 2024):

As far as I can tell, `vocab_only` is not currently used. It may be a deprecated option, or a placeholder for future use.


@Demirrr commented on GitHub (Jul 26, 2024):

Great thank you!

Reference: github-starred/ollama-ollama#3715