[PR #12527] [MERGED] discover: Disable flash attention for Jetson Xavier (CC 7.2) #39733

Closed
opened 2026-04-23 00:44:00 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/12527
Author: @jessegross
Created: 10/7/2025
Status: Merged
Merged: 10/8/2025
Merged by: @jessegross

Base: main ← Head: jessegross/xavier


📝 Commits (1)

  • 06b7ee7 discover: Disable flash attention for Jetson Xavier (CC 7.2)

📊 Changes

3 files changed (+17 additions, -13 deletions)

📝 discover/gpu.go (+4 -8)
📝 discover/types.go (+5 -4)
📝 llm/memory.go (+8 -1)

📄 Description

GGML picks the wrong kernel and these systems fail with:

Sep 28 22:25:39 xavier ollama[48999]: //ml/backend/ggml/ggml/src/ggml-cuda/fattn-wmma-f16.cu:437: ERROR: CUDA kernel flash_attn_ext_f16 has no device code compatible with CUDA arch 720. ggml-cuda.cu was compiled for: __CUDA_ARCH_LIST__

Fixes #12442
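
The diff itself is not reproduced on this page, so the following is only a minimal sketch of the kind of check the title describes: treating CUDA compute capability 7.2 (Jetson Xavier) as not supporting flash attention. The `gpuInfo` type, its fields, and both function names are hypothetical and are not taken from `discover/types.go` or `llm/memory.go`. Returning `true` for every other device is a simplification of this sketch, not a claim about the real support rules.

```go
package main

// gpuInfo is a hypothetical, trimmed-down description of one GPU.
// The field names are assumptions made for this sketch.
type gpuInfo struct {
	Library      string // backend name, e.g. "cuda"
	ComputeMajor int    // CUDA compute capability, major part
	ComputeMinor int    // CUDA compute capability, minor part
}

// flashAttentionSupported reports whether flash attention may be enabled
// on a single GPU. Only the CC 7.2 exclusion comes from the PR title;
// every other device is assumed to be supported in this sketch.
func flashAttentionSupported(g gpuInfo) bool {
	if g.Library == "cuda" && g.ComputeMajor == 7 && g.ComputeMinor == 2 {
		// Jetson Xavier: the log above shows that flash_attn_ext_f16
		// has no device code compatible with CUDA arch 720.
		return false
	}
	return true
}

// allSupportFlashAttention returns true only if every GPU in the list
// passes the per-device check. An empty list yields true.
func allSupportFlashAttention(gpus []gpuInfo) bool {
	for _, g := range gpus {
		if !flashAttentionSupported(g) {
			return false
		}
	}
	return true
}
```

Under these assumptions, a caller would consult `allSupportFlashAttention` before enabling flash attention, so that a Xavier device falls back to the regular attention path instead of hitting the missing kernel.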


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-23 00:44:00 -05:00
Reference: github-starred/ollama#39733