[GH-ISSUE #5450] Inference fails on AMD when using >1 GPU. #65443

Closed
opened 2026-05-03 21:18:03 -05:00 by GiteaMirror · 3 comments

Originally created by @Speedway1 on GitHub (Jul 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5450

Originally assigned to: @dhiltgen on GitHub.

### What is the issue?

This is on AMD. I have 2 x Radeon RX 7900 XTX cards (24 GB each).

For models whose memory use fits on a single GPU, everything works fine.
As soon as both cards are required, inference fails and produces garbage, as seen in this output:

ollama@TH-AI2:~$ ollama list
NAME                                    ID              SIZE    MODIFIED    
deepseek-coder-v2:latest                8577f96d693e    8.9 GB  10 days ago
codestral:latest                        fcc0019dcee9    12 GB   11 days ago
qwen2:latest                            e0d4e1163c58    4.4 GB  11 days ago
command-r:latest                        b8cdfff0263c    20 GB   11 days ago
mxbai-embed-large:latest                468836162de7    669 MB  11 days ago
llama3:70b                              786f3184aec0    39 GB   11 days ago
phi3:14b-medium-128k-instruct-f16       e89861c3ba63    27 GB   11 days ago
ollama@TH-AI2:~$ ollama run command-r:latest
>>> Hello how are you?
???????????????????????????????

>>> /bye

Codestral is only 12 GB and runs on a single GPU; it works fine:

ollama@TH-AI2:~$ ollama run command-r:latest
>>> Hello how are you?
???????????????????????????????

>>> /bye
ollama@TH-AI2:~$ ollama run codestral:latest
>>> Create a ruby script that counts from 1 to 100 and outputs to the console.
 Here's a simple Ruby script that counts from 1 to 100 and outputs to the console:

```ruby
(1..100).each do |number|
  puts number
end
```

The `(1..100)` creates a range of numbers from 1 to 100. The `each` method is then used to iterate over each number in the range. Finally, the `puts` method outputs the current number to the console.


phi3:14b-medium requires 2 GPUs for its 27 GB size, and it too outputs garbage:

ollama@TH-AI2:~$ ollama run phi3:14b-medium-128k-instruct-f16
>>> Hello how are you?
###############################

>>> Send a message (/? for help)





### OS

Linux

### GPU

AMD

### CPU

AMD

### Ollama version

0.1.48
GiteaMirror added the gpu, amd, bug labels 2026-05-03 21:18:04 -05:00

@Speedway1 commented on GitHub (Jul 3, 2024):

Here is the setup:

root@TH-AI2:~# rocm-smi

============================================ ROCm System Management Interface ============================================
====================================================== Concise Info ======================================================
Device  Node  IDs              Temp    Power    Partitions          SCLK   MCLK     Fan  Perf  PwrCap       VRAM%  GPU%  
              (DID,     GUID)  (Edge)  (Avg)    (Mem, Compute, ID)                                                       
==========================================================================================================================
0       1     0x744c,   55924  44.0°C  26.0W    N/A, N/A, 0         62Mhz  96Mhz    0%   auto  327.0W       94%    1%    
1       2     0x744c,   27211  43.0°C  19.0W    N/A, N/A, 0         63Mhz  96Mhz    0%   auto  327.0W       92%    2%    
2       3     0x164e,   33198  30.0°C  21.086W  N/A, N/A, 0         None   1800Mhz  0%   auto  Unsupported  15%    0%    
==========================================================================================================================
================================================== End of ROCm SMI Log ===================================================


@Speedway1 commented on GitHub (Jul 3, 2024):

Just to add extra information, it looks like there's a memory issue when running across 2 AMD GPUs. Here is the log output when it fails:

Jul  4 00:37:27 TH-AI2 ollama[238317]: DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=4 tid="124200820454208" timestamp=1720049847
Jul  4 00:37:27 TH-AI2 kernel: [82387.667170] amd_iommu_report_page_fault: 102 callbacks suppressed
Jul  4 00:37:27 TH-AI2 kernel: [82387.667174] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081000000 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667190] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081000300 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667203] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081000b00 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667215] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081001000 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667226] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081000c00 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667237] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081001300 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667249] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081002900 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667261] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081001400 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667272] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081002400 flags=0x0020]
Jul  4 00:37:27 TH-AI2 kernel: [82387.667283] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf081001f00 flags=0x0020]
Jul  4 00:37:28 TH-AI2 kernel: [82387.670406] AMD-Vi: IOMMU event log overflow

And here is the loading sequence:

Jul  4 00:36:48 TH-AI2 ollama[1488]: ggml_cuda_init: CUDA_USE_TENSOR_CORES: yes
Jul  4 00:36:48 TH-AI2 ollama[1488]: ggml_cuda_init: found 2 ROCm devices:
Jul  4 00:36:48 TH-AI2 ollama[1488]:   Device 0: Radeon RX 7900 XTX, compute capability 11.0, VMM: no
Jul  4 00:36:48 TH-AI2 ollama[1488]:   Device 1: Radeon RX 7900 XTX, compute capability 11.0, VMM: no
Jul  4 00:36:48 TH-AI2 ollama[1488]: llm_load_tensors: ggml ctx size =    0.51 MiB
Jul  4 00:36:49 TH-AI2 ollama[1488]: llm_load_tensors: offloading 40 repeating layers to GPU
Jul  4 00:36:49 TH-AI2 ollama[1488]: llm_load_tensors: offloading non-repeating layers to GPU
Jul  4 00:36:49 TH-AI2 ollama[1488]: llm_load_tensors: offloaded 41/41 layers to GPU
Jul  4 00:36:49 TH-AI2 ollama[1488]: llm_load_tensors:      ROCm0 buffer size =  9261.66 MiB
Jul  4 00:36:49 TH-AI2 ollama[1488]: llm_load_tensors:      ROCm1 buffer size = 10020.25 MiB
Jul  4 00:36:49 TH-AI2 ollama[1488]: llm_load_tensors:        CPU buffer size =  1640.62 MiB
Jul  4 00:36:49 TH-AI2 ollama[1488]: time=2024-07-04T00:36:49.931+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.04"
Jul  4 00:36:50 TH-AI2 ollama[1488]: time=2024-07-04T00:36:50.182+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.09"
Jul  4 00:36:50 TH-AI2 ollama[1488]: time=2024-07-04T00:36:50.433+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.14"
Jul  4 00:36:50 TH-AI2 ollama[1488]: time=2024-07-04T00:36:50.684+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.18"
Jul  4 00:36:50 TH-AI2 ollama[1488]: time=2024-07-04T00:36:50.935+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.24"
Jul  4 00:36:51 TH-AI2 ollama[1488]: time=2024-07-04T00:36:51.186+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.28"
Jul  4 00:36:51 TH-AI2 ollama[1488]: time=2024-07-04T00:36:51.437+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.33"
Jul  4 00:36:51 TH-AI2 ollama[1488]: time=2024-07-04T00:36:51.687+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.38"
Jul  4 00:36:51 TH-AI2 ollama[1488]: time=2024-07-04T00:36:51.938+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.43"
Jul  4 00:36:52 TH-AI2 ollama[1488]: time=2024-07-04T00:36:52.189+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.44"
Jul  4 00:36:52 TH-AI2 ollama[1488]: time=2024-07-04T00:36:52.442+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.55"
Jul  4 00:36:52 TH-AI2 ollama[1488]: time=2024-07-04T00:36:52.693+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.63"
Jul  4 00:36:52 TH-AI2 ollama[1488]: time=2024-07-04T00:36:52.943+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.72"
Jul  4 00:36:53 TH-AI2 ollama[1488]: time=2024-07-04T00:36:53.194+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.80"
Jul  4 00:36:53 TH-AI2 ollama[1488]: time=2024-07-04T00:36:53.445+01:00 level=DEBUG source=server.go:605 msg="model load progress 0.88"
Jul  4 00:36:53 TH-AI2 ollama[1488]: time=2024-07-04T00:36:53.896+01:00 level=INFO source=server.go:594 msg="waiting for server to become available" status="llm server not responding"
Jul  4 00:36:54 TH-AI2 ollama[1488]: llama_new_context_with_model: n_ctx      = 8192
Jul  4 00:36:54 TH-AI2 ollama[1488]: llama_new_context_with_model: n_batch    = 512
Jul  4 00:36:54 TH-AI2 ollama[1488]: llama_new_context_with_model: n_ubatch   = 512
Jul  4 00:36:54 TH-AI2 ollama[1488]: llama_new_context_with_model: flash_attn = 0
Jul  4 00:36:54 TH-AI2 ollama[1488]: llama_new_context_with_model: freq_base  = 8000000.0
Jul  4 00:36:54 TH-AI2 ollama[1488]: llama_new_context_with_model: freq_scale = 1
Jul  4 00:36:54 TH-AI2 ollama[1488]: time=2024-07-04T00:36:54.314+01:00 level=INFO source=server.go:594 msg="waiting for server to become available" status="llm server loading model"
Jul  4 00:36:54 TH-AI2 ollama[1488]: time=2024-07-04T00:36:54.315+01:00 level=DEBUG source=server.go:605 msg="model load progress 1.00"
Jul  4 00:36:54 TH-AI2 ollama[1488]: time=2024-07-04T00:36:54.565+01:00 level=DEBUG source=server.go:608 msg="model load completed, waiting for server to become available" status="llm server loading model"
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_kv_cache_init:      ROCm0 KV buffer size =  5376.00 MiB
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_kv_cache_init:      ROCm1 KV buffer size =  4864.00 MiB
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_new_context_with_model: KV self size  = 10240.00 MiB, K (f16): 5120.00 MiB, V (f16): 5120.00 MiB
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_new_context_with_model:  ROCm_Host  output buffer size =     4.03 MiB
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_new_context_with_model: pipeline parallelism enabled (n_copies=4)
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_new_context_with_model:      ROCm0 compute buffer size =  1216.01 MiB
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_new_context_with_model:      ROCm1 compute buffer size =  1216.02 MiB
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_new_context_with_model:  ROCm_Host compute buffer size =    80.02 MiB
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_new_context_with_model: graph nodes  = 1208
Jul  4 00:36:55 TH-AI2 ollama[1488]: llama_new_context_with_model: graph splits = 3
Jul  4 00:36:55 TH-AI2 kernel: [82355.181846] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080000100 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181864] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080000400 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181877] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080000c00 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181889] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080001100 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181901] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080001900 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181913] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080000600 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181924] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080001800 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181936] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080002c00 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181948] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080001300 flags=0x0020]
Jul  4 00:36:55 TH-AI2 kernel: [82355.181959] amdgpu 0000:0c:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0014 address=0xf080002300 flags=0x0020]
Jul  4 00:36:56 TH-AI2 ollama[238317]: DEBUG [initialize] initializing slots | n_slots=4 tid="124200820454208" timestamp=1720049816
Jul  4 00:36:56 TH-AI2 ollama[238317]: DEBUG [initialize] new slot | n_ctx_slot=2048 slot_id=0 tid="124200820454208" timestamp=1720049816
Jul  4 00:36:56 TH-AI2 ollama[238317]: DEBUG [initialize] new slot | n_ctx_slot=2048 slot_id=1 tid="124200820454208" timestamp=1720049816
Jul  4 00:36:56 TH-AI2 ollama[238317]: DEBUG [initialize] new slot | n_ctx_slot=2048 slot_id=2 tid="124200820454208" timestamp=1720049816
Jul  4 00:36:56 TH-AI2 ollama[238317]: DEBUG [initialize] new slot | n_ctx_slot=2048 slot_id=3 tid="124200820454208" timestamp=1720049816
Jul  4 00:36:56 TH-AI2 ollama[238317]: INFO [main] model loaded | tid="124200820454208" timestamp=1720049816
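
The AMD-Vi IO_PAGE_FAULT events above are the most telling part of the log: they typically mean a device DMA access could not be translated by the IOMMU. The following is not part of the original report, only a hedged first diagnostic step assuming the IOMMU configuration is involved; booting with `iommu=pt` is a commonly suggested experiment for faults like these, not a confirmed fix for this issue:

```bash
# Show how the IOMMU is configured on the current kernel command line
cat /proc/cmdline

# Look for AMD-Vi / IOMMU initialization messages and any further page faults
sudo dmesg | grep -iE 'iommu|amd-vi'

# A common experiment for faults like these: put the IOMMU into passthrough
# mode by adding iommu=pt to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub,
# then regenerate the bootloader config and reboot:
#   sudo update-grub && sudo reboot
```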


@eliranwong commented on GitHub (Jul 4, 2024):

> Here is the setup:
>
> root@TH-AI2:~# rocm-smi
>
> ============================================ ROCm System Management Interface ============================================
> ====================================================== Concise Info ======================================================
> Device  Node  IDs              Temp    Power    Partitions          SCLK   MCLK     Fan  Perf  PwrCap       VRAM%  GPU%  
>               (DID,     GUID)  (Edge)  (Avg)    (Mem, Compute, ID)                                                       
> ==========================================================================================================================
> 0       1     0x744c,   55924  44.0°C  26.0W    N/A, N/A, 0         62Mhz  96Mhz    0%   auto  327.0W       94%    1%    
> 1       2     0x744c,   27211  43.0°C  19.0W    N/A, N/A, 0         63Mhz  96Mhz    0%   auto  327.0W       92%    2%    
> 2       3     0x164e,   33198  30.0°C  21.086W  N/A, N/A, 0         None   1800Mhz  0%   auto  Unsupported  15%    0%    
> ==========================================================================================================================
> ================================================== End of ROCm SMI Log ===================================================

I am using dual AMD RX 7900 XTX cards too, but have no issues at all, e.g.:

ubuntu@ai:~/eliran/ai$ ollama list
NAME                 	ID          	SIZE  	MODIFIED       
deepseek-v2:16b      	7c8c332f2df7	8.9 GB	21 minutes ago	
deepseek-coder-v2:16b	8577f96d693e	8.9 GB	36 minutes ago	
internlm2:7b         	5050e36678ab	4.5 GB	2 hours ago   	
gemma2:27b           	371038893ee3	15 GB 	2 hours ago   	
gemma2:9b            	c19987e1e6e2	5.4 GB	7 hours ago   	
codellama:7b-code    	fc84f39375bc	3.8 GB	2 days ago    	
codellama:7b-instruct	8fdf8f752f6e	3.8 GB	2 days ago    	
dbrx:132b            	36800d8d3a28	74 GB 	3 days ago    	
wizardlm2:latest     	c9b1aff820f2	4.1 GB	4 days ago    	
command-r-plus:latest	c9c6cc6d20c7	59 GB 	5 days ago    	
mistral:latest       	2ae6f6dd7a3d	4.1 GB	5 days ago    	
ubuntu@ai:~/eliran/ai$ ollama run command-r-plus

>>> How are you?
As an AI language model, I don't have feelings or emotions in the traditional sense. However, my purpose is to 
assist and provide helpful responses to your queries, so from that perspective, I'm doing well! How can I help you 
today?

========================================== ROCm System Management Interface ==========================================
==================================================== Concise Info ====================================================
Device  Node  IDs              Temp    Power   Partitions          SCLK     MCLK   Fan     Perf  PwrCap  VRAM%  GPU%  
              (DID,     GUID)  (Edge)  (Avg)   (Mem, Compute, ID)                                                     
======================================================================================================================
0       2     0x744c,   45048  54.0°C  122.0W  N/A, N/A, 0         3152Mhz  96Mhz  22.75%  auto  327.0W  90%    100%  
1       1     0x744c,   12575  48.0°C  124.0W  N/A, N/A, 0         3056Mhz  96Mhz  14.9%   auto  327.0W  99%    100%  
======================================================================================================================
================================================ End of ROCm SMI Log =================================================

May I ask two questions:

  1. What ROCm version are you using? I am using ROCm 6.1.3, which officially extends support to the 7000 series. My setup notes are recorded at https://github.com/eliranwong/MultiAMDGPU_AIDev_Ubuntu

  2. From your rocm-smi output, I see you have 3 devices instead of 2. Are you using an iGPU together with your two discrete GPUs? The official documentation states that mixing an iGPU with discrete GPUs causes issues with ROCm, so it is better to disable the iGPU (see the sketch after this list).
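
Not from the original thread, but two quick checks related to the questions above, written as a hedged sketch: reading the installed ROCm version from the standard install tree, and restricting the ROCm runtime to the discrete cards via ROCR_VISIBLE_DEVICES (a standard ROCm runtime variable; whether hiding the iGPU resolves this particular fault is an assumption):

```bash
# Report the installed ROCm version (path may vary by install method)
cat /opt/rocm/.info/version

# List the agents the ROCm runtime actually sees, including any iGPU
rocminfo | grep -E 'Marketing Name|Device Type'

# Restrict the ROCm runtime to the discrete GPUs before starting ollama;
# indices follow the runtime's enumeration and may differ from rocm-smi's
export ROCR_VISIBLE_DEVICES=0,1
```

If hiding the iGPU this way changes the behaviour, that would point at the mixed iGPU/dGPU configuration rather than the dual-card split itself.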

Reference: github-starred/ollama#65443