As we automatically enable flash attention for more models, there will likely be cases where we get it wrong. This change allows setting OLLAMA_FLASH_ATTENTION=0 to disable flash attention, even for models where it would normally be enabled automatically.
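For illustration, a minimal Go sketch of the kind of override this describes, assuming the environment variable is read directly and consulted before the per-model default. The function and its signature below are hypothetical, not Ollama's actual code; only the OLLAMA_FLASH_ATTENTION variable itself comes from the change described above.

```go
package main

import (
	"fmt"
	"os"
)

// flashAttentionEnabled is a hypothetical helper illustrating the override:
// the model's automatic default applies unless OLLAMA_FLASH_ATTENTION is
// set explicitly, in which case "0"/"false" forces flash attention off
// even for models where it would otherwise be enabled.
func flashAttentionEnabled(modelDefault bool) bool {
	switch os.Getenv("OLLAMA_FLASH_ATTENTION") {
	case "0", "false":
		return false
	case "1", "true":
		return true
	default:
		// Fall back to the automatic per-model choice.
		return modelDefault
	}
}

func main() {
	// With OLLAMA_FLASH_ATTENTION=0 in the environment, this prints "false"
	// even though the model's default would enable flash attention.
	fmt.Println(flashAttentionEnabled(true))
}
```

In practice the variable is set in the server's environment, e.g. `OLLAMA_FLASH_ATTENTION=0 ollama serve`.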