[GH-ISSUE #6023] Expose unavailable Llama-CPP flags #50280

Closed
opened 2026-04-28 14:56:26 -05:00 by GiteaMirror · 1 comment

Originally created by @doomgrave on GitHub (Jul 28, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6023

Please expose all of the llama.cpp flags when we configure the model via the Modelfile.
For example, `offload_kqv`, `flash_attn`, and `logits_all` can be needed in specific use cases!
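For illustration, a hypothetical Modelfile sketch of what exposing these flags as parameters might look like. Only `num_ctx` is a real Modelfile parameter today; treating `offload_kqv`, `flash_attn`, and `logits_all` as `PARAMETER` options is an assumption that mirrors the request, not existing Ollama syntax:

```
# Hypothetical sketch only: num_ctx is a real Modelfile parameter,
# but the three flags below are the *requested* additions and do not
# work in current Ollama.
FROM llama3
PARAMETER num_ctx 4096
PARAMETER offload_kqv false   # requested: llama.cpp offload_kqv (keep KV cache on CPU)
PARAMETER flash_attn true     # requested: llama.cpp flash_attn
PARAMETER logits_all true     # requested: llama.cpp logits_all (logits for every token)
```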

GiteaMirror added the feature request label 2026-04-28 14:56:26 -05:00

@jmorganca commented on GitHub (Sep 4, 2024):

Hi @doomgrave, thanks for the issue.

`flash_attn` and similar flags are available via the `OLLAMA_FLASH_ATTENTION` environment variable. Others are tracked in their respective feature requests; see #2415.
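For reference, a minimal shell sketch of the environment variable route; the variable itself is real, while the model name is just an example:

```sh
# Enable flash attention for every model served by this instance.
OLLAMA_FLASH_ATTENTION=1 ollama serve

# Then, from another shell, run a model as usual:
ollama run llama3
```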

Reference: github-starred/ollama#50280