[PR #11862] discover: CPU supports flash attention #13637

Closed
opened 2026-04-13 00:31:44 -05:00 by GiteaMirror · 0 comments
Owner

Original Pull Request: https://github.com/ollama/ollama/pull/11862

State: closed
Merged: Yes


We already run flash attention on CPU layers when a model is partially offloaded, but it was being disabled when running entirely on CPU, which is unnecessary.
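The change described above amounts to dropping a pure-CPU special case from the flash-attention decision. The sketch below illustrates that logic; the function and parameter names are hypothetical, not ollama's actual API.

```go
package main

import "fmt"

// flashAttentionSupported sketches the decision after this change:
// flash attention stays enabled both with partial GPU offloading and
// on pure CPU (the CPU backend already ran it for the non-offloaded
// layers during partial offloading). Illustrative only.
func flashAttentionSupported(gpuLayers int, userDisabled bool) bool {
	if userDisabled {
		return false
	}
	// Before the fix the check was effectively `return gpuLayers > 0`,
	// which turned flash attention off for pure-CPU runs even though
	// the same CPU kernels handled it under partial offloading.
	return true
}

func main() {
	fmt.Println(flashAttentionSupported(0, false))  // pure CPU: now supported
	fmt.Println(flashAttentionSupported(32, false)) // partial offload: supported
	fmt.Println(flashAttentionSupported(0, true))   // explicitly disabled by user
}
```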

GiteaMirror added the pull-request label 2026-04-13 00:31:44 -05:00

Reference: github-starred/ollama#13637