[PR #14979] Add missing hipBLASLt kernels and some gfx targets cleanup #61647

Open
opened 2026-04-29 16:42:34 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14979
Author: @slojosic-amd
Created: 3/20/2026
Status: 🔄 Open

Base: main ← Head: add_missing_hipblaslt_kernels


📝 Commits (8)

  • d9b0324 Add missing hipblaslt folder
  • 3e613d4 gfx950 target is not supported by ggml, there is no CDNA4 logic added
  • fe8788b gfx942 should be excluded from windows build
  • 4b1c961 cleanup gfx940 nad gfx941: https://github.com/ROCm/ROCm/issues/4825
  • cdfbc1e Fix removing of gfx906 rocblas kernels from Windows build
  • 95f3fb9 gfx900/gfx906 rocblas kernels are not part of official ROCm release, we should delete gfx950 rocblas kernels instead
  • 411e869 Remove gfx950 hipblaslt kernels
  • 0796abf Merge branch 'ollama:main' into add_missing_hipblaslt_kernels

📊 Changes

6 files changed (+12 additions, -9 deletions)

📝 .github/workflows/release.yaml (+1 -1)
📝 CMakeLists.txt (+5 -3)
📝 CMakePresets.json (+2 -2)
📝 Dockerfile (+2 -1)
📝 docs/gpu.mdx (+1 -1)
📝 scripts/build_windows.ps1 (+1 -1)

📄 Description

ROCm build fixes: hipBLASLt support and GPU target cleanup

Changes

Add hipblaslt folder to install (CMakeLists.txt)

The hipblaslt directory was not being copied to the ollama install folder during the HIP component install, even though it contains GPU kernels required for inference on RDNA3/RDNA4 and MI300 GPUs. Fixed by adding an explicit install(DIRECTORY ... hipblaslt ...) rule alongside the existing rocblas install rule.
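One way to check for this class of omission is to look for both kernel directories in the install tree after the build. The following is a minimal sketch, not part of the PR: `missing_kernel_dirs` is a hypothetical helper, and the `rocblas/library` and `hipblaslt/library` layout is assumed from the Dockerfile paths quoted later in this description.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): print the name of each ROCm
# kernel library whose "library" directory is missing under the given root.
# Assumed layout: <root>/rocblas/library and <root>/hipblaslt/library.
missing_kernel_dirs() {
    root="$1"
    for lib in rocblas hipblaslt; do
        if [ ! -d "$root/$lib/library" ]; then
            echo "$lib"
        fi
    done
}

# Demo against a throwaway tree that mimics the pre-fix state:
# rocblas is present, hipblaslt is absent.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/rocblas/library"
missing_kernel_dirs "$demo_root"   # prints: hipblaslt
```

Against a real build, the root would presumably be something like `dist/lib/ollama/rocm`, and an empty result would indicate that both directories were installed.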

Fix gfx906 kernel removal on Windows (build_windows.ps1, release.yaml)

The existing Remove-Item call with wildcards in the path was silently failing on Windows. It was replaced with Get-ChildItem -Filter | Remove-Item -Force, which reliably removes the stale gfx906 rocblas kernels.

Remove unsupported GPU targets (CMakePresets.json, CMakeLists.txt)

  • Removed gfx950 (MI350X): ggml has no CDNA4 architecture support, and it contains no MFMA or other MI350-specific code paths. Only gfx942 is recognized as CDNA3.
  • Removed gfx940 and gfx941: these targets do not exist in ggml. See ROCm/ROCm#4825.
  • Excluded gfx942 (MI300X/MI300A) from Windows builds: AMD Instinct MI-series cards are not supported by the Windows HIP runtime.

Fix Dockerfile cleanup (Dockerfile)

The rm -f dist/lib/ollama/rocm/rocblas/library/*gfx90[06]* line was targeting gfx900 and gfx906 kernels that no longer exist in ROCm 7.2.0. It now removes gfx950 kernels from both rocblas/library/ and hipblaslt/library/ instead; these kernels ship with the ROCm distribution but correspond to an unsupported GPU target.
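The effect of the updated cleanup can be sketched in plain shell. This is an illustration under stated assumptions, not the actual Dockerfile line: `remove_gfx950_kernels` is a hypothetical helper, and the file names in the demo are made up rather than real ROCm kernel file names. Note that the old glob `*gfx90[06]*` matches only names containing `gfx900` or `gfx906`, so it would never have matched a `gfx950` file.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): delete gfx950 kernel files from
# both library directories under the given ROCm install root.
remove_gfx950_kernels() {
    rocm_dir="$1"
    # rm -f stays silent if a glob matches nothing.
    rm -f "$rocm_dir"/rocblas/library/*gfx950* \
          "$rocm_dir"/hipblaslt/library/*gfx950*
}

# Demo on a throwaway tree. File names are invented for illustration.
demo="$(mktemp -d)"
mkdir -p "$demo/rocblas/library" "$demo/hipblaslt/library"
touch "$demo/rocblas/library/kernels_gfx950.dat" \
      "$demo/rocblas/library/kernels_gfx942.dat" \
      "$demo/hipblaslt/library/kernels_gfx950.dat" \
      "$demo/hipblaslt/library/kernels_gfx1100.dat"

remove_gfx950_kernels "$demo"

# Only the non-gfx950 files should remain in each directory.
ls "$demo/rocblas/library" "$demo/hipblaslt/library"
```

Listing both paths in one `rm -f` keeps the cleanup in a single step; whether the PR does it in one command or two is not shown in this description.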

Update GPU documentation (docs/gpu.mdx)

Removed gfx950 / MI350X from the list of supported Linux targets.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 16:42:34 -05:00

Reference: github-starred/ollama#61647