[GH-ISSUE #8851] Memory access fault by GPU node-1 (Agent handle: 0x5635e3db2590) on address 0x7f189722f000. Reason: Page not present or supervisor privilege. (ollama via docker) #5738

Open
opened 2026-04-12 17:01:33 -05:00 by GiteaMirror · 7 comments
Owner

Originally created by @nicoKoehler on GitHub (Feb 5, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8851

What is the issue?

When running the ollama docker image (specs below) and trying to run any model, ollama itself gives the following error:
ollama runner process has terminated: signal: aborted (core dumped)

The docker container gives a long list of Debug messages, but the core error is:

ollama  | Memory access fault by GPU node-1 (Agent handle: 0x559ee9967590) on address 0x7fe05f439000. Reason: Page not present or supervisor privilege.
ollama  | time=2025-02-05T15:41:26.252Z level=ERROR source=sched.go:455 msg="error loading llama server" error="llama runner process has terminated: signal: aborted (core dumped)"

The same setup works fine with my RX 550; my assumption is that it is too old for ROCm, so it falls back to CPU.

The memory access fault is not unique to ollama: it also occurs when running the rocm/pytorch containers. I will enquire there as well, but was hoping that folks here might have some ideas.

docker-compose.yml

services:

  ollama:
    image: ollama/ollama:rocm
    container_name: ollama
    privileged: true
    environment:
      HSA_OVERRIDE_GFX_VERSION: 11.0.0
      AMD_SERIALIZE_KERNEL: 3
      HIP_VISIBLE_DEVICES: 0
      OLLAMA_DEBUG: 1
      AMD_LOG_LEVEL: 3
    devices:
      - "/dev/kfd"
      - "/dev/dri"
    security_opt:
      - seccomp:unconfined
    cap_add:
      - SYS_PTRACE
    ipc: host
    group_add:
      - video
    volumes:
      - /home/user/.ollama:/root/.ollama
      - /home/user/ollama/models:/usr/share/ollama
    ports:
      - "11434:11434"

rocminfo

ROCk module is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    11th Gen Intel(R) Core(TM) i7-11700 @ 2.50GHz
  Uuid:                    CPU-XX                             
  Marketing Name:          11th Gen Intel(R) Core(TM) i7-11700 @ 2.50GHz
  Vendor Name:             CPU                                
  Feature:                 None specified                     
  Profile:                 FULL_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        0(0x0)                             
  Queue Min Size:          0(0x0)                             
  Queue Max Size:          0(0x0)                             
  Queue Type:              MULTI                              
  Node:                    0                                  
  Device Type:             CPU                                
  Cache Info:              
    L1:                      49152(0xc000) KB                   
  Chip ID:                 0(0x0)                             
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   4800                               
  BDFID:                   0                                  
  Internal Node ID:        0                                  
  Compute Unit:            16                                 
  SIMDs per CU:            0                                  
  Shader Engines:          0                                  
  Shader Arrs. per Eng.:   0                                  
  WatchPts on Addr. Ranges:1                                  
  Features:                None
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: FINE GRAINED        
      Size:                    16080720(0xf55f50) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 2                   
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    16080720(0xf55f50) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
    Pool 3                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    16080720(0xf55f50) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       TRUE                               
  ISA Info:                
*******                  
Agent 2                  
*******                  
  Name:                    gfx1100                            
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon RX 6700 XT              
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          64(0x40)                           
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
    L2:                      3072(0xc00) KB                     
    L3:                      98304(0x18000) KB                  
  Chip ID:                 29663(0x73df)                      
  ASIC Revision:           0(0x0)                             
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   2855                               
  BDFID:                   1024                               
  Internal Node ID:        1                                  
  Compute Unit:            40                                 
  SIMDs per CU:            2                                  
  Shader Engines:          2                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      TRUE                               
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        32(0x20)                           
  Max Work-item Per CU:    1024(0x400)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    12566528(0xbfc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1100         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32
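Side note for anyone comparing dumps: the agent Name above reflects any active HSA_OVERRIDE_GFX_VERSION (an RX 6700 XT is natively gfx1031; the gfx1100 shown here comes from the 11.0.0 override in the compose file). A minimal sketch, assuming plain rocminfo text, for pulling the reported targets out of a dump:

```python
import re

# Excerpt of the rocminfo dump above (agent and ISA name lines).
SAMPLE = """\
Agent 2
  Name:                    gfx1100
  Uuid:                    GPU-XX
  Marketing Name:          AMD Radeon RX 6700 XT
  ISA Info:
    ISA 1
      Name:                    amdgcn-amd-amdhsa--gfx1100
"""

def gfx_targets(text: str) -> "list[str]":
    """Return every gfx target reported as an agent Name (one per GPU).

    ISA entries like amdgcn-amd-amdhsa--gfx1100 are deliberately not
    matched, since the pattern requires the value to start with gfx.
    """
    return re.findall(r"^\s*Name:\s+(gfx\w+)\s*$", text, flags=re.MULTILINE)
```
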

OS

Linux

GPU

AMD

CPU

Intel

Ollama version

0.5.7-0-ga420a45-dirty

GiteaMirror added the amd, bug labels 2026-04-12 17:01:33 -05:00

@sunarowicz commented on GitHub (May 2, 2025):

Same problem here.

Ollama version: 0.6.6 (installed on the system by the official install script, not in a container)
GPU: Radeon 780M iGPU in a Ryzen 7 78003D (16GB VRAM reserved in BIOS)
ROCm installed
Environment variables used: HSA_OVERRIDE_GFX_VERSION=11.0.0, AMD_LOG_LEVEL=3, OLLAMA_DEBUG=1

rocminfo (truncated):

ROCk module version 6.12.12 is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.15
Runtime Ext Version:     1.7
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
XNACK enabled:           NO
DMAbuf Support:          YES
VMM Support:             YES
...
*******                  
Agent 2                  
*******                  
  Name:                    gfx1036                            
  Uuid:                    GPU-XX                             
  Marketing Name:          AMD Radeon Graphics                
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH
...

ollama server messages (truncated):

load_tensors: layer   0 assigned to device ROCm0, is_swa = 0
load_tensors: layer   1 assigned to device ROCm0, is_swa = 0
load_tensors: layer   2 assigned to device ROCm0, is_swa = 0
load_tensors: layer   3 assigned to device ROCm0, is_swa = 0
load_tensors: layer   4 assigned to device ROCm0, is_swa = 0
load_tensors: layer   5 assigned to device ROCm0, is_swa = 0
load_tensors: layer   6 assigned to device ROCm0, is_swa = 0
load_tensors: layer   7 assigned to device ROCm0, is_swa = 0
load_tensors: layer   8 assigned to device ROCm0, is_swa = 0
load_tensors: layer   9 assigned to device ROCm0, is_swa = 0
load_tensors: layer  10 assigned to device ROCm0, is_swa = 0
load_tensors: layer  11 assigned to device ROCm0, is_swa = 0
load_tensors: layer  12 assigned to device ROCm0, is_swa = 0
load_tensors: layer  13 assigned to device ROCm0, is_swa = 0
load_tensors: layer  14 assigned to device ROCm0, is_swa = 0
load_tensors: layer  15 assigned to device ROCm0, is_swa = 0
load_tensors: layer  16 assigned to device ROCm0, is_swa = 0
load_tensors: layer  17 assigned to device ROCm0, is_swa = 0
load_tensors: layer  18 assigned to device ROCm0, is_swa = 0
load_tensors: layer  19 assigned to device ROCm0, is_swa = 0
load_tensors: layer  20 assigned to device ROCm0, is_swa = 0
load_tensors: layer  21 assigned to device ROCm0, is_swa = 0
load_tensors: layer  22 assigned to device ROCm0, is_swa = 0
load_tensors: layer  23 assigned to device ROCm0, is_swa = 0
load_tensors: layer  24 assigned to device ROCm0, is_swa = 0
load_tensors: layer  25 assigned to device ROCm0, is_swa = 0
load_tensors: layer  26 assigned to device ROCm0, is_swa = 0
load_tensors: layer  27 assigned to device ROCm0, is_swa = 0
load_tensors: layer  28 assigned to device ROCm0, is_swa = 0
load_tensors: tensor 'token_embd.weight' (q4_K) (and 0 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
time=2025-05-02T18:12:30.500+02:00 level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
:3:hip_device_runtime.cpp   :634 : 152053957662d us:   hipGetDevice ( 0x70ce877fd71c ) 
:3:hip_device_runtime.cpp   :642 : 152053957671d us:  hipGetDevice: Returned hipSuccess : 
:3:hip_memory.cpp           :845 : 152053957675d us:   hipMemGetInfo ( 0x70ce877fd930, 0x70ce877fd938 ) 
:3:hip_memory.cpp           :869 : 152053957684d us:  hipMemGetInfo: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :634 : 152053957695d us:   hipGetDevice ( 0x70ce877fd5cc ) 
:3:hip_device_runtime.cpp   :642 : 152053957696d us:  hipGetDevice: Returned hipSuccess : 
:3:hip_device_runtime.cpp   :634 : 152053957697d us:   hipGetDevice ( 0x70ce877fd5cc ) 
:3:hip_device_runtime.cpp   :642 : 152053957698d us:  hipGetDevice: Returned hipSuccess : 
:3:hip_memory.cpp           :703 : 152053957706d us:   hipMalloc ( 0x70ce877fd628, 980104704 ) 
:3:rocdevice.cpp            :2425: 152054017710d us:  Device=0x5d8d5aef70f0, freeMem_ = 0xd7c15da00
:3:hip_memory.cpp           :705 : 152054017732d us:  hipMalloc: Returned hipSuccess : 0x70cd12c00000: duration: 60026d us
:3:hip_device_runtime.cpp   :634 : 152054017748d us:   hipGetDevice ( 0x70ce877fd62c ) 
:3:hip_device_runtime.cpp   :642 : 152054017750d us:  hipGetDevice: Returned hipSuccess : 
:3:hip_memory.cpp           :3310: 152054017755d us:   hipMemset ( 0x70cd1f7cba00, 0, 210 ) 
:3:rocdevice.cpp            :3064: 152054017762d us:  Number of allocated hardware queues with low priority: 0, with normal priority: 0, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp            :3140: 152054027419d us:  Created SWq=0x70ce9c0d8000 to map on HWq=0x70ce74100000 with size 16384 with priority 1, cooperative: 0
:3:rocdevice.cpp            :3232: 152054027431d us:  acquireQueue refCount: 0x70ce74100000 (1)
:3:devprogram.cpp           :2648: 152054199825d us:  Using Code Object V5.
Memory access fault by GPU node-1 (Agent handle: 0x5d8d5aeee0b0) on address 0x70cee5a30000. Reason: Page not present or supervisor privilege.
time=2025-05-02T18:12:35.513+02:00 level=ERROR source=sched.go:457 msg="error loading llama server" error="llama runner process has terminated: signal: aborted (core dumped)"

@chilman408 commented on GitHub (Jun 24, 2025):

I have the same issue with Ollama in docker running in Unraid with AMD iGPU 780m.

Let me know how I can help with debugging?


@0fflineuser commented on GitHub (Aug 17, 2025):

Same problem here; it used to work.


@0fflineuser commented on GitHub (Aug 17, 2025):

Nevermind, I had HSA_OVERRIDE_GFX_VERSION set to 11.0.0 instead of 10.3.0.
Sry.
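For context (my reading of the naming convention, not something stated in this thread): the override string maps onto the gfx target name as decimal major digits followed by one hex digit each for minor and stepping, so gfx1030 is 10.3.0 and gfx1100 is 11.0.0. The value you set should be a target the bundled ROCm libraries actually ship kernels for, which is why 10.3.0 (gfx1030) is the usual choice for RDNA2 cards and many recent iGPUs. A sketch of that mapping:

```python
def gfx_to_version(gfx: str) -> str:
    """Map a gfx target name to an HSA_OVERRIDE_GFX_VERSION-style string.

    Assumed convention: the last two characters are hex minor/stepping
    digits and the rest is the decimal major version.
    """
    body = gfx.removeprefix("gfx")   # e.g. "1030", "90a"
    major, minor, step = body[:-2], body[-2], body[-1]
    return f"{int(major)}.{int(minor, 16)}.{int(step, 16)}"
```
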


@juanluisbaptiste commented on GitHub (Nov 29, 2025):

> Nevermind, I had HSA_OVERRIDE_GFX_VERSION set to 11.0.0 instead of 10.3.0. Sry.

Where do you set that var? I have the same problem on an AMD Ryzen AI MAX+ 395 with a Radeon 8060S GPU: every time I launch a query on ollama, X restarts most of the time (other times the PC completely freezes), and in the logs I see the same error.


@nicoKoehler commented on GitHub (Nov 29, 2025):

@juanluisbaptiste You can set it either in your Docker Compose file, if you are using a container, or as an environment variable in the OS you are using.
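To make that concrete for the systemd case (the official install script sets up an ollama.service unit; the drop-in path below is the conventional one, not confirmed in this thread):

```ini
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=10.3.0"
```

followed by `systemctl daemon-reload` and `systemctl restart ollama`. In a container setup the variable goes under `environment:` exactly as in the docker-compose.yml at the top of this issue.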


@pcarranza commented on GitHub (Mar 1, 2026):

Thank you all. With this final detail I managed to run ollama under podman on a Beelink SER9 Pro (AMD Ryzen™ AI 9 HX 370); sharing for posterity in case someone else needs this in the future:

After installing the ROCm driver, add your user to the render group with `sudo usermod -aG render $USER`, reboot, then:

podman run --pull newer \
	--security-opt label=type:container_runtime_t \
	--group-add keep-groups \
	--replace \
	--device /dev/kfd \
	--device /dev/dri \
	-e HSA_OVERRIDE_GFX_VERSION=10.3.0 \
	-e OLLAMA_HOST=0.0.0.0:11434 \
	-e OLLAMA_MODELS=/root/.ollama/models \
	-e OLLAMA_DEBUG=2 \
	-v ollama:/root/.ollama \
	-p 11434:11434 \
	--name ollama \
	docker.io/ollama/ollama:rocm

Then in the logs:

...
time=2026-03-01T08:37:30.458Z level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm]" devices="[{DeviceID:{ID:0 Library:ROCm} Name:ROCm0 Description:AMD Radeon Graphics FilterID: Integrated:true PCIID:0000:c5:00.0 TotalMemory:35687571456 FreeMemory:35088281600 ComputeMajor:16 ComputeMinor:48 DriverMajor:60342 DriverMinor:13 LibraryPath:[/usr/lib/ollama /usr/lib/ollama/rocm]}]"
time=2026-03-01T08:37:30.458Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=1.059048769s OLLAMA_LIBRARY_PATH="[/usr/lib/ollama /usr/lib/ollama/rocm]" extra_envs="map[GGML_CUDA_INIT:1 ROCR_VISIBLE_DEVICES:0]"
time=2026-03-01T08:37:30.458Z level=TRACE source=runner.go:174 msg="supported GPU library combinations before filtering" supported=map[ROCm:map[/usr/lib/ollama/rocm:map[0:0]]]
time=2026-03-01T08:37:30.458Z level=DEBUG source=runner.go:193 msg="adjusting filtering IDs" FilterID=0 new_ID=0
time=2026-03-01T08:37:30.458Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=1.859426026s
time=2026-03-01T08:37:30.458Z level=INFO source=types.go:42 msg="inference compute" id=0 filter_id=0 library=ROCm compute=gfx1030 name=ROCm0 description="AMD Radeon Graphics" libdirs=ollama,rocm driver=60342.13 pci_id=0000:c5:00.0 type=iGPU total="33.2 GiB" available="32.7 GiB"
time=2026-03-01T08:37:30.458Z level=INFO source=routes.go:1768 msg="vram-based default context" total_vram="33.2 GiB" default_num_ctx=32768
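One more note on the render-group step above: membership only takes effect after a fresh login (or the reboot mentioned), and it can be checked programmatically. A small sketch (the helper is mine, not part of any tool in this thread) over /etc/group-style data:

```python
import grp
from collections import namedtuple

# Minimal stand-in for the structs grp.getgrall() returns.
GroupEntry = namedtuple("GroupEntry", "gr_name gr_mem")

def in_group(user, group, groups=None):
    """True if `user` is a supplementary member of `group`.

    render (and usually video) membership is what grants access to
    /dev/kfd and /dev/dri; the user's primary group is not checked here.
    """
    if groups is None:
        groups = grp.getgrall()
    return any(g.gr_name == group and user in g.gr_mem for g in groups)
```

e.g. `in_group("youruser", "render")` on the host before starting the container.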
Reference: github-starred/ollama#5738