[PR #14445] Add AMDGPU targets gfx1150/gfx1151 for AMD Strix Halo (RDNA 3.5) support #40547

Open
opened 2026-04-23 01:25:28 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14445
Author: @Colossus14
Created: 2/26/2026
Status: 🔄 Open

Base: main ← Head: add-gfx1150-1151-support


📝 Commits (1)

  • 6f0e5e2 Add AMDGPU targets gfx1150/gfx1151 (AMD Strix Halo RDNA 3.5)

📊 Changes

1 file changed (+1 addition, -1 deletion)


📝 CMakeLists.txt (+1 -1)

📄 Description

Summary

Add gfx1150 and gfx1151 to the AMDGPU_TARGETS filter regex in CMakeLists.txt, enabling ROCm/HIP GPU acceleration on AMD Strix Halo APUs.

Change

One-line regex addition:

```diff
- list(FILTER AMDGPU_TARGETS INCLUDE REGEX "^gfx(94[012]|101[02]|1030|110[012]|120[01])$")
+ list(FILTER AMDGPU_TARGETS INCLUDE REGEX "^gfx(94[012]|101[02]|1030|110[012]|115[01]|120[01])$")
```
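The updated pattern can be sanity-checked outside of CMake: `grep -E` accepts the same extended-regex syntax used here (a quick sketch, not part of the build):

```shell
# The updated filter regex from CMakeLists.txt.
re='^gfx(94[012]|101[02]|1030|110[012]|115[01]|120[01])$'

# The new Strix Halo targets pass the filter; unrelated targets do not.
printf '%s\n' gfx1150 gfx1151 gfx906 | grep -E "$re"
# prints:
#   gfx1150
#   gfx1151
```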

What hardware does this enable?

AMD Strix Halo APUs ship with integrated RDNA 3.5 GPUs that report as gfx1150 or gfx1151:

| Product | GPU | Compute Target | Memory |
|---------|-----|----------------|--------|
| AMD Ryzen AI MAX+ 395 | Radeon 8060S | gfx1151 | Up to 128 GB shared (iGPU uses system RAM) |
| AMD Ryzen AI MAX 390 | Radeon 8060S | gfx1151 | Up to 128 GB shared |
| AMD Ryzen AI MAX+ 385 | Radeon 8050S | gfx1150/1151 | Up to 128 GB shared |

These are high-end laptop/workstation APUs with large shared memory pools (up to 128 GB addressable by the GPU), making them well-suited for running large language models locally. Without this patch, Ollama's HIP backend is not compiled for these targets and falls back to CPU-only inference.
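On real hardware, the compute target the ROCm stack reports can be checked with `rocminfo` (part of ROCm). A sketch, with an offline illustration since the sample `Name:` line below is only representative of what a Strix Halo machine prints:

```shell
# On a machine with ROCm installed:
#   rocminfo | grep -m1 -o 'gfx[0-9]*'
# Offline illustration with a representative rocminfo agent line:
echo "  Name:                    gfx1151" | grep -m1 -o 'gfx[0-9]*'
# prints: gfx1151
```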

Test environment

| Component | Detail |
|-----------|--------|
| CPU/APU | AMD Ryzen AI MAX+ 395 w/ Radeon 8060S |
| GPU compute target | gfx1151 (RDNA 3.5) |
| GPU memory | 111.5 GiB GTT (shared system memory) |
| OS | Fedora Linux 43 (kernel 6.18.10) |
| ROCm | 6.4.2 (Fedora packages) |
| HIP | 6.4.43484 (clang 19.0.0) |

Test results

Built from this branch, Ollama correctly detects the GPU:

```
inference compute id=0 library=ROCm compute=gfx1151 name=ROCm0
  description="Radeon 8060S Graphics" type=iGPU
  total="111.5 GiB" available="111.3 GiB"
```

Inference test with qwen3.5:35b (23 GB model, fully GPU-offloaded):

| Metric | Value |
|--------|-------|
| Model load time | 11.76 seconds |
| Inference speed | 42.5 tokens/sec |
| Output tokens | 448 |
| Total duration | 10.9 seconds |
| VRAM used | 32.2 GiB |
| Context window | 262,144 tokens |
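As a rough cross-check of the table (a sketch; it assumes the 42.5 tok/s figure is a generation-only rate, while total duration also covers non-generation time):

```shell
# Naive rate over the whole request: output tokens / total duration.
# Slightly below the reported 42.5 tok/s, consistent with the total
# duration including time not spent generating tokens.
awk 'BEGIN { printf "%.1f tok/s\n", 448 / 10.9 }'
# prints: 41.1 tok/s
```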

Runtime note

Users currently need to set HSA_OVERRIDE_GFX_VERSION=11.5.1 at runtime for the ROCm HSA runtime to recognize gfx1151. This is a ROCm-side limitation (gfx1151 is not yet in ROCm's official support matrix), not an Ollama issue. This will likely become unnecessary as ROCm adds native Strix Halo support.
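A minimal sketch of the workaround (assuming a local build from this branch; the override must be set in the environment of the server process):

```shell
# ROCm-side workaround: have the HSA runtime treat the iGPU as gfx1151.
export HSA_OVERRIDE_GFX_VERSION=11.5.1
# Then start the server in the same environment, e.g.:
#   ollama serve
echo "HSA_OVERRIDE_GFX_VERSION=$HSA_OVERRIDE_GFX_VERSION"
# prints: HSA_OVERRIDE_GFX_VERSION=11.5.1
```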

Related

This follows the same pattern as existing targets in the filter: gfx1100/1101/1102 (RDNA 3) and gfx1200/1201 (RDNA 4) are already included. gfx1150/1151 (RDNA 3.5) fills the gap between the two generations.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-23 01:25:28 -05:00
Reference: github-starred/ollama#40547