[GH-ISSUE #2555] EOF error on /api/chat or /api/generate #27257

Closed
opened 2026-04-22 04:26:12 -05:00 by GiteaMirror · 40 comments

Originally created by @saamerm on GitHub (Feb 17, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2555

Originally assigned to: @dhiltgen on GitHub.

  • Upon running ollama run dolphin-phi on Linux (it works fine on Mac), I get this error: Error: Post "http://127.0.0.1:11434/api/chat": EOF.
  • It seems to have installed successfully too, so it looks like there's some error in the starting of the server?
  • I tried to add a --v for a more verbose understanding of the issue, but that didn't help.
  • Any ideas what I can do to debug?

I have a feeling that the error is originating from [the Chat function of api/client.go](https://github.com/ollama/ollama/blob/f9fd08040be10bf3d944b642dff86020474cede6/api/client.go#L227), which is called by [loadModel in cmd/interactive.go](https://github.com/ollama/ollama/blob/f9fd08040be10bf3d944b642dff86020474cede6/cmd/interactive.go#L59), which is called by generateInteractive() in the same file, which itself is called by the [RunHandler in cmd/cmd.go](https://github.com/ollama/ollama/blob/f9fd08040be10bf3d944b642dff86020474cede6/cmd/cmd.go#L212).

Within that Chat() function, I'm guessing that the issue is coming from the stream() function in the same file, but I can't tell what line it might be originating from.


@remy415 commented on GitHub (Feb 20, 2024):

"ollama -v" just prints the version information. If you want verbose output, export OLLAMA_DEBUG="1" is what you want.

Without logs, there isn't much to do since the message http://127.0.0.1:11434/api/chat: EOF just means the server had an issue. In my case, I was seeing that message when I was developing and had a segfault due to a typo. Try running it again with the above environment variable set, and if you get the same issue, the more verbose log should help pinpoint the problem.
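A foreground repro with the variable set might look like this (standard shell; stop the background service first so the port is free — the model name is just an example):

```shell
# Enable verbose server logging for this shell session...
export OLLAMA_DEBUG="1"
# ...then run the server in the foreground and reproduce the error
# from another terminal (commented out here since these block):
# ollama serve
# ollama run dolphin-phi
```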


@tincore commented on GitHub (Feb 21, 2024):

Thanks for all the hard work.

I'm running on 0.1.25 and this error just happened to me when trying to run the 'gemma' models.

[ollama.log](https://github.com/ollama/ollama/files/14366287/ollama.log)

Other models that I downloaded recently are working fine (including dolphin-phi)


@remy415 commented on GitHub (Feb 21, 2024):

> Thanks for all the hard work.
>
> I'm running on 0.1.25 and this error just happened to me when trying to run the 'gemma' models.
>
> [ollama.log](https://github.com/ollama/ollama/files/14366287/ollama.log)
>
> Other models that I downloaded recently are working fine (including dolphin-phi)

Could you set export OLLAMA_DEBUG="1" and run it again please?

Though if it's just the gemma models, maybe the model was compiled incorrectly or in a new way that isn't quite supported.


@tincore commented on GitHub (Feb 21, 2024):

Thanks for the quick reply.

Actually I've just realized that you released 0.1.26.

I've upgraded and now it's working fine ;)


@musab commented on GitHub (Feb 22, 2024):

[MacOS]
I closed the "Ollama" app from the Mac menu bar. Reopened it and after a minute or so, I had the option to "Update" from the menu bar icon. This fixed the issue OP is reporting.


@wszme commented on GitHub (Feb 22, 2024):

Have you solved this problem?
sudo ollama run gemma:7b
Error: Post "http://127.0.0.1:11434/api/chat": EOF


@tincore commented on GitHub (Feb 22, 2024):

@wszme For me it was fixed after updating to latest version.

As a side note, there is another issue about this (https://github.com/ollama/ollama/issues/2650): gemma:7b is not running great at the moment.


@wszme commented on GitHub (Feb 22, 2024):

@tincore how do I get the latest version?


@jafarzzz commented on GitHub (Feb 22, 2024):

I fixed it by upgrading Ollama to 0.1.26. You won't be able to do it from the application. Uninstall Ollama and download the latest version from https://ollama.com/

gemma:7b worked after this fix.
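On Linux, re-running the official install script is the usual way to pull the latest release (script URL as published on ollama.com; on macOS, download the app from the site instead):

```shell
# Re-run the official Linux install script to upgrade in place,
# then confirm the version.
curl -fsSL https://ollama.com/install.sh | sh
ollama -v
```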


@wszme commented on GitHub (Feb 22, 2024):

@jafarzzz thanks so much, I've solved it using your method.


@andreaganduglia commented on GitHub (Feb 22, 2024):

ollama run gemma:2b
pulling manifest
...
verifying sha256 digest 
writing manifest 
removing any unused layers 
success 
Error: Post "http://127.0.0.1:11434/api/chat": EOF
$ ollama -v
ollama version is 0.1.25

I can confirm that version 0.1.26 resolves this issue.


@wszme commented on GitHub (Feb 22, 2024):

@andreaganduglia yes
$ ollama -v
ollama version is 0.1.26


@Jamaludiin commented on GitHub (Feb 24, 2024):

Install the latest version of Ollama
ollama version is 0.1.27
because Gemma was only just released in the Ollama repo.


@ketsapiwiq commented on GitHub (Feb 24, 2024):

Hi!
I have the same error on 0.1.27 and sadly no debug output.
I'm on ROCm, which may be related? But I can use llama.cpp with both CPU and my ROCm GPU.

(venv) hadrien@tomate: OLLAMA_DEBUG="1" ollama run gemma:2b                                 130 ↵
Error: Post "http://127.0.0.1:11434/api/chat": EOF
(venv) hadrien@tomate: OLLAMA_DEBUG="1" ollama -v                                           130 ↵
ollama version is 0.1.27
(venv) hadrien@tomate: sudo systemctl status ollama        
● ollama.service - Ollama Service
     Loaded: loaded (/etc/systemd/system/ollama.service; disabled; preset: disabled)
     Active: active (running) since Sat 2024-02-24 11:06:27 CET; 2min 3s ago
   Main PID: 902381 (ollama)
      Tasks: 17 (limit: 57508)
     Memory: 475.2M (peak: 475.9M)
        CPU: 3.495s
     CGroup: /system.slice/ollama.service
             └─902381 /usr/local/bin/ollama serve

févr. 24 11:06:27 tomate ollama[902381]: time=2024-02-24T11:06:27.175+01:00 level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
févr. 24 11:06:29 tomate ollama[902381]: time=2024-02-24T11:06:29.102+01:00 level=INFO source=payload_common.go:146 msg="Dynamic LLM libraries [cpu_avx cpu_avx2 rocm_v6 cpu cuda_v11 rocm_v5]"
févr. 24 11:06:29 tomate ollama[902381]: time=2024-02-24T11:06:29.102+01:00 level=INFO source=gpu.go:94 msg="Detecting GPU type"
févr. 24 11:06:29 tomate ollama[902381]: time=2024-02-24T11:06:29.102+01:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library libnvidia-ml.so"
févr. 24 11:06:29 tomate ollama[902381]: time=2024-02-24T11:06:29.109+01:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
févr. 24 11:06:29 tomate ollama[902381]: time=2024-02-24T11:06:29.109+01:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library librocm_smi64.so"
févr. 24 11:06:29 tomate ollama[902381]: time=2024-02-24T11:06:29.109+01:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: [/opt/rocm/lib/librocm_smi64.so.5.0]"
févr. 24 11:06:29 tomate ollama[902381]: time=2024-02-24T11:06:29.112+01:00 level=INFO source=gpu.go:109 msg="Radeon GPU detected"
févr. 24 11:06:29 tomate ollama[902381]: time=2024-02-24T11:06:29.112+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
févr. 24 11:06:34 tomate ollama[902381]: [GIN] 2024/02/24 - 11:06:34 | 200 |      49.238µs |       127.0.0.1 | GET      "/api/version"

@Jamaludiin commented on GitHub (Feb 24, 2024):

@ketsapiwiq I am only using the default ollama run gemma, not ollama run gemma:2b. Maybe other steps can help.


@remy415 commented on GitHub (Feb 24, 2024):

@ketsapiwiq there's a different syntax to get debug enabled when running as a service.

First edit the ollama service file:
sudo nano /etc/systemd/system/ollama.service
Add the environment variable to it:

[Service]
(leave these options alone…)
Environment="OLLAMA_DEBUG=1"

Then restart the service. daemon-reload makes systemd reload the unit's configuration files.

sudo systemctl daemon-reload
sudo systemctl restart ollama.service

Now when you view your logs it will be with debug enabled. Note that if you are running ollama serve from the command line, you would do it the way you quoted:
OLLAMA_DEBUG="1" ollama serve
Or export the variable so it stays enabled for your current terminal session:
export OLLAMA_DEBUG="1"
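With debug enabled on the service, the logs can be followed live while reproducing the error (standard systemd tooling; the unit name matches the service file above):

```shell
# Follow the ollama unit's journal while reproducing the error in
# another terminal; the debug output around /api/chat should appear here.
journalctl -u ollama.service -f
# Or dump recent entries without following:
journalctl -u ollama.service --since "10 minutes ago" --no-pager
```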


@ketsapiwiq commented on GitHub (Feb 25, 2024):

Thank you!
Sorry, I could have guessed/tried that.
So here's what I get: a free(): invalid pointer.

I'm open to trying to recompile ollama; normally I use options such as make HIP_VISIBLE_DEVICES="0" LLAMA_HIPBLAS=1 to get llama.cpp working with my GPU (RX 6750 XT).

hadrien@tomate:~ »  OLLAMA_DEBUG=“1” ollama serve
time=2024-02-25T11:58:59.885+01:00 level=INFO source=images.go:710 msg="total blobs: 0"
time=2024-02-25T11:58:59.885+01:00 level=INFO source=images.go:717 msg="total unused blobs removed: 0"
time=2024-02-25T11:58:59.886+01:00 level=INFO source=routes.go:1019 msg="Listening on 127.0.0.1:11434 (version 0.1.27)"
time=2024-02-25T11:58:59.886+01:00 level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
time=2024-02-25T11:59:01.766+01:00 level=INFO source=payload_common.go:146 msg="Dynamic LLM libraries [rocm_v6 cpu cuda_v11 cpu_avx cpu_avx2 rocm_v5]"
time=2024-02-25T11:59:01.766+01:00 level=DEBUG source=payload_common.go:147 msg="Override detection logic by setting OLLAMA_LLM_LIBRARY"
time=2024-02-25T11:59:01.766+01:00 level=INFO source=gpu.go:94 msg="Detecting GPU type"
time=2024-02-25T11:59:01.766+01:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library libnvidia-ml.so"
time=2024-02-25T11:59:01.766+01:00 level=DEBUG source=gpu.go:283 msg="gpu management search paths: [/usr/local/cuda/lib64/libnvidia-ml.so* /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so* /usr/lib/x86_64-linux-gnu/libnvidia-ml.so* /usr/lib/wsl/lib/libnvidia-ml.so* /usr/lib/wsl/drivers/*/libnvidia-ml.so* /opt/cuda/lib64/libnvidia-ml.so* /usr/lib*/libnvidia-ml.so* /usr/local/lib*/libnvidia-ml.so* /usr/lib/aarch64-linux-gnu/nvidia/current/libnvidia-ml.so* /usr/lib/aarch64-linux-gnu/libnvidia-ml.so* /opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so* /home/hadrien/libnvidia-ml.so*]"
time=2024-02-25T11:59:01.781+01:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
time=2024-02-25T11:59:01.781+01:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library librocm_smi64.so"
time=2024-02-25T11:59:01.781+01:00 level=DEBUG source=gpu.go:283 msg="gpu management search paths: [/opt/rocm*/lib*/librocm_smi64.so* /home/hadrien/librocm_smi64.so*]"
time=2024-02-25T11:59:01.782+01:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: [/opt/rocm/lib/librocm_smi64.so.5.0]"
wiring rocm management library functions in /opt/rocm/lib/librocm_smi64.so.5.0
dlsym: rsmi_init
dlsym: rsmi_shut_down
dlsym: rsmi_dev_memory_total_get
dlsym: rsmi_dev_memory_usage_get
dlsym: rsmi_version_get
dlsym: rsmi_num_monitor_devices
dlsym: rsmi_dev_id_get
dlsym: rsmi_dev_name_get
dlsym: rsmi_dev_brand_get
dlsym: rsmi_dev_vendor_name_get
dlsym: rsmi_dev_vram_vendor_get
dlsym: rsmi_dev_serial_number_get
dlsym: rsmi_dev_subsystem_name_get
dlsym: rsmi_dev_vbios_version_get
time=2024-02-25T11:59:01.785+01:00 level=INFO source=gpu.go:109 msg="Radeon GPU detected"
time=2024-02-25T11:59:01.785+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-02-25T11:59:01.785+01:00 level=DEBUG source=gpu.go:158 msg="error looking up amd driver version: %s" !BADKEY="amdgpu file stat error: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-02-25T11:59:01.785+01:00 level=DEBUG source=amd.go:76 msg="malformed gfx_target_version 0"
discovered 1 ROCm GPU Devices
[0] ROCm device name: Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT]
[0] ROCm brand: Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT]
[0] ROCm vendor: Advanced Micro Devices, Inc. [AMD/ATI]
[0] ROCm VRAM vendor: samsung
rsmi_dev_serial_number_get failed: 2
[0] ROCm subsystem name: 0xe36
[0] ROCm vbios version: 113-D5121100-101
[0] ROCm totalMem 12868124672
[0] ROCm usedMem 2378190848
time=2024-02-25T11:59:01.787+01:00 level=DEBUG source=gpu.go:254 msg="rocm detected 1 devices with 8979M available memory"
[GIN] 2024/02/25 - 11:59:15 | 200 |      31.218µs |       127.0.0.1 | HEAD     "/"
[GIN] 2024/02/25 - 11:59:15 | 404 |      67.117µs |       127.0.0.1 | POST     "/api/show"
time=2024-02-25T11:59:17.521+01:00 level=INFO source=download.go:136 msg="downloading c1864a5eb193 in 17 100 MB part(s)"
time=2024-02-25T11:59:37.440+01:00 level=INFO source=download.go:136 msg="downloading 097a36493f71 in 1 8.4 KB part(s)"
time=2024-02-25T11:59:39.403+01:00 level=INFO source=download.go:136 msg="downloading 109037bec39c in 1 136 B part(s)"
time=2024-02-25T11:59:42.447+01:00 level=INFO source=download.go:136 msg="downloading 22a838ceb7fb in 1 84 B part(s)"
time=2024-02-25T11:59:44.331+01:00 level=INFO source=download.go:136 msg="downloading 887433b89a90 in 1 483 B part(s)"
[GIN] 2024/02/25 - 11:59:46 | 200 | 30.349096058s |       127.0.0.1 | POST     "/api/pull"
[GIN] 2024/02/25 - 11:59:46 | 200 |     369.527µs |       127.0.0.1 | POST     "/api/show"
[GIN] 2024/02/25 - 11:59:46 | 200 |     197.788µs |       127.0.0.1 | POST     "/api/show"
time=2024-02-25T11:59:47.115+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-02-25T11:59:47.115+01:00 level=DEBUG source=gpu.go:158 msg="error looking up amd driver version: %s" !BADKEY="amdgpu file stat error: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-02-25T11:59:47.115+01:00 level=DEBUG source=amd.go:76 msg="malformed gfx_target_version 0"
discovered 1 ROCm GPU Devices
[0] ROCm device name: Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT]
[0] ROCm brand: Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT]
[0] ROCm vendor: Advanced Micro Devices, Inc. [AMD/ATI]
[0] ROCm VRAM vendor: samsung
rsmi_dev_serial_number_get failed: 2
[0] ROCm subsystem name: 0xe36
[0] ROCm vbios version: 113-D5121100-101
[0] ROCm totalMem 12868124672
[0] ROCm usedMem 2397560832
time=2024-02-25T11:59:47.126+01:00 level=DEBUG source=gpu.go:254 msg="rocm detected 1 devices with 8961M available memory"
time=2024-02-25T11:59:47.126+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-02-25T11:59:47.126+01:00 level=DEBUG source=gpu.go:158 msg="error looking up amd driver version: %s" !BADKEY="amdgpu file stat error: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-02-25T11:59:47.126+01:00 level=DEBUG source=amd.go:76 msg="malformed gfx_target_version 0"
discovered 1 ROCm GPU Devices
[0] ROCm device name: Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT]
[0] ROCm brand: Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT]
[0] ROCm vendor: Advanced Micro Devices, Inc. [AMD/ATI]
[0] ROCm VRAM vendor: samsung
rsmi_dev_serial_number_get failed: 2
[0] ROCm subsystem name: 0xe36
[0] ROCm vbios version: 113-D5121100-101
[0] ROCm totalMem 12868124672
[0] ROCm usedMem 2397560832
time=2024-02-25T11:59:47.128+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-02-25T11:59:47.128+01:00 level=DEBUG source=payload_common.go:93 msg="ordered list of LLM libraries to try [/tmp/ollama3414513364/rocm_v5/libext_server.so /tmp/ollama3414513364/rocm_v6/libext_server.so /tmp/ollama3414513364/cpu_avx2/libext_server.so]"
loading library /tmp/ollama3414513364/rocm_v5/libext_server.so
time=2024-02-25T11:59:47.307+01:00 level=INFO source=dyn_ext_server.go:90 msg="Loading Dynamic llm server: /tmp/ollama3414513364/rocm_v5/libext_server.so"
time=2024-02-25T11:59:47.307+01:00 level=INFO source=dyn_ext_server.go:150 msg="Initializing llama server"
[1708858787] system info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | 
[1708858787] Performing pre-initialization of GPU
free(): invalid pointer
[1]    1510443 IOT instruction (core dumped)  OLLAMA_DEBUG=“1” ollama serve

@remy415 commented on GitHub (Feb 25, 2024):

I don’t know a lot about AMD GPUs, I haven’t used one in a very long time. But I see two errant things going on: it looks as though amd.go isn’t finding the expected item at /sys/module/amdgpu/version. Then something in the background C code is attempting to free an invalid pointer, possibly a pointer assigned by that item.

Perhaps something went wrong during installation, or an incompatible driver version is installed. You could always try reinstalling the AMD driver: https://www.amd.com/en/support/kb/faq/amdgpu-installation. Again, I have to stress I don’t know much about AMD GPUs, so I’m just brainstorming ideas. If you installed the Open version, try installing the Pro version, and vice versa.

Sorry, I wish I had more ideas for you. I know ollama is in the process of changing some of the AMD driver loading, but I don’t have an AMD GPU to test with, so I can’t really debug it. They may have a bug fix coming soon.


@remy415 commented on GitHub (Feb 25, 2024):

@ketsapiwiq I have some theories. I've tried to implement a fix in a fork of Ollama, if you'd like to test it out I can help you with that.
First, ensure you have the proper Go toolchain, cmake, and gcc versions:

cmake version 3.24 or higher
go version 1.21 or higher
gcc version 11.4.0 or higher

Then:

  1. Clone the fork from here:
    git clone https://github.com/remy415/ollama.git
  2. cd ollama
  3. go generate ./... && go build .
  4. OLLAMA_DEBUG=1 ./ollama serve

If I'm right, then I'll submit a bug fix to clean that up.


@remy415 commented on GitHub (Feb 25, 2024):

@ketsapiwiq Nevermind, I played around with it some more and determined that my fix isn't relevant to this problem. I'll poke around some more and update if I find anything.


@remy415 commented on GitHub (Feb 25, 2024):

The only way I was able to replicate the error message free(): invalid pointer in C was to call free(test); with char test; (an int, char, float... anything that isn't a pointer assignment). Even when I did free(&test), it still reported an invalid pointer.

tegra@ok3d-1:~/ok3d/testing_go$ /usr/bin/gcc -fdiagnostics-color=always -g /home/tegra/ok3d/testing_go/header.c -o /home/tegra/ok3d/testing_go/header
/home/tegra/ok3d/testing_go/header.c: In function ‘main’:
/home/tegra/ok3d/testing_go/header.c:20:10: warning: passing argument 1 of ‘free’ makes pointer from integer without a cast [-Wint-conversion]
   20 |     free(testp);
      |          ^~~~~
      |          |
      |          char
In file included from /home/tegra/ok3d/testing_go/header.c:1:
/usr/include/stdlib.h:565:25: note: expected ‘void *’ but argument is of type ‘char’
  565 | extern void free (void *__ptr) __THROW;
      |                   ~~~~~~^~~~~
tegra@ok3d-1:~/ok3d/testing_go$ ./header 
Segmentation fault (core dumped)
tegra@ok3d-1:~/ok3d/testing_go$ /usr/bin/gcc -fdiagnostics-color=always -g /home/tegra/ok3d/testing_go/header.c -o /home/tegra/ok3d/testing_go/header
/home/tegra/ok3d/testing_go/header.c: In function ‘main’:
/home/tegra/ok3d/testing_go/header.c:20:5: warning: attempt to free a non-heap object ‘testp’ [-Wfree-nonheap-object]
   20 |     free(&testp);
      |     ^~~~~~~~~~~~
tegra@ok3d-1:~/ok3d/testing_go$ ./header 
free(): invalid pointer
Aborted (core dumped)
tegra@ok3d-1:~/ok3d/testing_go$ 

If I initialize it with malloc, it doesn't error, even when the pointer is NULL (free(NULL) is defined as a no-op).

In Go, I was unable to replicate the error, even through cgo calls. The Go compiler will properly identify that a value is not an "unsafe pointer" (i.e. something that has been heap-allocated with malloc(), calloc(), etc.). I formatted the example code to be almost exactly the same as the code in dyn_ext_server.go:

func testFunc(len C.size_t) C.test_struct_t {
	var resp C.test_struct_t
	resp.msg_len = len
	bytes := make([]byte, len)
	resp.msg = (*C.char)(C.CBytes(bytes))

	if resp.msg == nil {
		fmt.Println("resp.msg nil")
	}
	return resp
}

func main() {
	resp := testFunc(128)
	var test_resp C.test_string

	defer C.free(unsafe.Pointer(resp.msg))
	defer C.free(unsafe.Pointer(test_resp))

	fmt.Println("test_resp:", test_resp)
	fmt.Println("resp.msg:", C.GoString(resp.msg))
}
tegra@ok3d-1:~/ok3d/testing_go$ go run main.go
# command-line-arguments
./main.go:35:30: cannot convert test_resp (variable of type _Ctype_test_string) to type unsafe.Pointer
tegra@ok3d-1:~/ok3d/testing_go$ 

I'm still relatively new to C/C++/Go programming, so I may have missed something; if so, please correct me. But as it stands, my working assumption is that somewhere in the C code imported into ollama, free() is being called on a memory address that isn't heap-allocated.

Side note: can anyone confirm this error still occurs in the updated Ollama? I'm not seeing it on the NVIDIA GPU systems I've run it on.


@remy415 commented on GitHub (Feb 26, 2024):

@ketsapiwiq have you tried installing rocm_v6 as per #2411?


@saamerm commented on GitHub (Feb 28, 2024):

Tried to update, but that didn't help. I followed the instructions, ran OLLAMA_DEBUG=1 ollama run dolphin-phi, and then ran sudo journalctl --no-pager -u ollama to get these logs: https://pastebin.com/9ZQVkuh1

Still not sure what the problem is though

ollama[549340]: [GIN] 2024/02/28 - 03:08:12 | 200 |    2.144736ms |       127.0.0.1 | HEAD     "/"
ollama[549340]: [GIN] 2024/02/28 - 03:08:12 | 200 |    8.552854ms |       127.0.0.1 | POST     "/api/show"
ollama[549340]: [GIN] 2024/02/28 - 03:08:12 | 200 |     461.818µs |       127.0.0.1 | POST     "/api/show"
ollama[549340]: time=2024-02-28T03:08:13.156Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
ollama[549340]: time=2024-02-28T03:08:13.156Z level=DEBUG source=amd.go:32 msg="amd driver not detected /sys/module/amdgpu"
ollama[549340]: time=2024-02-28T03:08:13.157Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
ollama[549340]: time=2024-02-28T03:08:13.157Z level=DEBUG source=amd.go:32 msg="amd driver not detected /sys/module/amdgpu"
ollama[549340]: time=2024-02-28T03:08:13.157Z level=INFO source=llm.go:77 msg="GPU not available, falling back to CPU"
ollama[549340]: time=2024-02-28T03:08:13.157Z level=DEBUG source=payload_common.go:93 msg="ordered list of LLM libraries to try [/tmp/ollama1451723426/cpu_avx2/libext_server.so]"
ollama[549340]: time=2024-02-28T03:08:13.174Z level=INFO source=dyn_ext_server.go:90 msg="Loading Dynamic llm server: /tmp/ollama1451723426/cpu_avx2/libext_server.so"
ollama[549340]: time=2024-02-28T03:08:13.174Z level=INFO source=dyn_ext_server.go:150 msg="Initializing llama server"
ollama[549340]: [1709089693] system info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 |
ollama[549340]: llama_model_loader: loaded meta data with 22 key-value pairs and 325 tensors from /usr/share/ollama/.ollama/models/blobs/sha256:4eca7304a07a42c48887f159ef5ad82ed5a5bd30fe52db4aadae1dd938e26f70 (version GGUF V3 (latest))
ollama[549340]: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
ollama[549340]: llama_model_loader: - kv   0:                       general.architecture str              = phi2
ollama[549340]: llama_model_loader: - kv   1:                               general.name str              = Phi2
ollama[549340]: llama_model_loader: - kv   2:                        phi2.context_length u32              = 2048
ollama[549340]: llama_model_loader: - kv   3:                      phi2.embedding_length u32              = 2560
ollama[549340]: llama_model_loader: - kv   4:                   phi2.feed_forward_length u32              = 10240
ollama[549340]: llama_model_loader: - kv   5:                           phi2.block_count u32              = 32
ollama[549340]: llama_model_loader: - kv   6:                  phi2.attention.head_count u32              = 32
ollama[549340]: llama_model_loader: - kv   7:               phi2.attention.head_count_kv u32              = 32
ollama[549340]: llama_model_loader: - kv   8:          phi2.attention.layer_norm_epsilon f32              = 0.000010
ollama[549340]: llama_model_loader: - kv   9:                  phi2.rope.dimension_count u32              = 32
ollama[549340]: llama_model_loader: - kv  10:                          general.file_type u32              = 2
ollama[549340]: llama_model_loader: - kv  11:               tokenizer.ggml.add_bos_token bool             = false
ollama[549340]: llama_model_loader: - kv  12:                       tokenizer.ggml.model str              = gpt2
ollama[549340]: llama_model_loader: - kv  13:                      tokenizer.ggml.tokens arr[str,51200]   = ["!", "\"", "#", "$", "%", "&", "'", ...
ollama[549340]: llama_model_loader: - kv  14:                  tokenizer.ggml.token_type arr[i32,51200]   = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
ollama[549340]: llama_model_loader: - kv  15:                      tokenizer.ggml.merges arr[str,50000]   = ["Ġ t", "Ġ a", "h e", "i n", "r e",...
ollama[549340]: llama_model_loader: - kv  16:                tokenizer.ggml.bos_token_id u32              = 50256
ollama[549340]: llama_model_loader: - kv  17:                tokenizer.ggml.eos_token_id u32              = 50295
ollama[549340]: llama_model_loader: - kv  18:            tokenizer.ggml.unknown_token_id u32              = 50256
ollama[549340]: llama_model_loader: - kv  19:            tokenizer.ggml.padding_token_id u32              = 50256
ollama[549340]: llama_model_loader: - kv  20:                    tokenizer.chat_template str              = {{ bos_token }}{%- set ns = namespace...
ollama[549340]: llama_model_loader: - kv  21:               general.quantization_version u32              = 2
ollama[549340]: llama_model_loader: - type  f32:  195 tensors
ollama[549340]: llama_model_loader: - type q4_0:  129 tensors
ollama[549340]: llama_model_loader: - type q6_K:    1 tensors
ollama[549340]: llm_load_vocab: mismatch in special tokens definition ( 910/51200 vs 944/51200 ).
ollama[549340]: llm_load_print_meta: format           = GGUF V3 (latest)
ollama[549340]: llm_load_print_meta: arch             = phi2
ollama[549340]: llm_load_print_meta: vocab type       = BPE
ollama[549340]: llm_load_print_meta: n_vocab          = 51200
ollama[549340]: llm_load_print_meta: n_merges         = 50000
ollama[549340]: llm_load_print_meta: n_ctx_train      = 2048
ollama[549340]: llm_load_print_meta: n_embd           = 2560
ollama[549340]: llm_load_print_meta: n_head           = 32
ollama[549340]: llm_load_print_meta: n_head_kv        = 32
ollama[549340]: llm_load_print_meta: n_layer          = 32
ollama[549340]: llm_load_print_meta: n_rot            = 32
ollama[549340]: llm_load_print_meta: n_embd_head_k    = 80
ollama[549340]: llm_load_print_meta: n_embd_head_v    = 80
ollama[549340]: llm_load_print_meta: n_gqa            = 1
ollama[549340]: llm_load_print_meta: n_embd_k_gqa     = 2560
ollama[549340]: llm_load_print_meta: n_embd_v_gqa     = 2560
ollama[549340]: llm_load_print_meta: f_norm_eps       = 1.0e-05
ollama[549340]: llm_load_print_meta: f_norm_rms_eps   = 0.0e+00
ollama[549340]: llm_load_print_meta: f_clamp_kqv      = 0.0e+00
ollama[549340]: llm_load_print_meta: f_max_alibi_bias = 0.0e+00
ollama[549340]: llm_load_print_meta: n_ff             = 10240
ollama[549340]: llm_load_print_meta: n_expert         = 0
ollama[549340]: llm_load_print_meta: n_expert_used    = 0
ollama[549340]: llm_load_print_meta: rope scaling     = linear
ollama[549340]: llm_load_print_meta: freq_base_train  = 10000.0
ollama[549340]: llm_load_print_meta: freq_scale_train = 1
ollama[549340]: llm_load_print_meta: n_yarn_orig_ctx  = 2048
ollama[549340]: llm_load_print_meta: rope_finetuned   = unknown
ollama[549340]: llm_load_print_meta: model type       = 3B
ollama[549340]: llm_load_print_meta: model ftype      = Q4_0
ollama[549340]: llm_load_print_meta: model params     = 2.78 B
ollama[549340]: llm_load_print_meta: model size       = 1.49 GiB (4.61 BPW)
ollama[549340]: llm_load_print_meta: general.name     = Phi2
ollama[549340]: llm_load_print_meta: BOS token        = 50256 '<|endoftext|>'
ollama[549340]: llm_load_print_meta: EOS token        = 50295 '<|im_end|>'
ollama[549340]: llm_load_print_meta: UNK token        = 50256 '<|endoftext|>'
ollama[549340]: llm_load_print_meta: PAD token        = 50256 '<|endoftext|>'
ollama[549340]: llm_load_print_meta: LF token         = 128 'Ä'
ollama[549340]: llm_load_tensors: ggml ctx size =    0.12 MiB
ollama[549340]: llm_load_tensors:        CPU buffer size =  1526.50 MiB
ollama[549340]: ...........................................................................................
ollama[549340]: llama_new_context_with_model: n_ctx      = 2048
ollama[549340]: llama_new_context_with_model: freq_base  = 10000.0
ollama[549340]: llama_new_context_with_model: freq_scale = 1
systemd[1]: ollama.service: Main process exited, code=killed, status=9/KILL
systemd[1]: ollama.service: Failed with result 'signal'.
systemd[1]: ollama.service: Consumed 8.800s CPU time.
systemd[1]: ollama.service: Scheduled restart job, restart counter is at 9.
systemd[1]: Stopped Ollama Service.
systemd[1]: ollama.service: Consumed 8.800s CPU time.
systemd[1]: Started Ollama Service.
ollama[549401]: time=2024-02-28T03:08:42.426Z level=INFO source=images.go:710 msg="total blobs: 6"
ollama[549401]: time=2024-02-28T03:08:42.441Z level=INFO source=images.go:717 msg="total unused blobs removed: 0"
ollama[549401]: time=2024-02-28T03:08:42.454Z level=INFO source=routes.go:1019 msg="Listening on 127.0.0.1:11434 (version 0.1.27)"
ollama[549401]: time=2024-02-28T03:08:42.455Z level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
ollama[549401]: time=2024-02-28T03:08:52.469Z level=INFO source=payload_common.go:146 msg="Dynamic LLM libraries [rocm_v6 cpu cpu_avx cpu_avx2 cuda_v11 rocm_v5]"
ollama[549401]: time=2024-02-28T03:08:52.470Z level=DEBUG source=payload_common.go:147 msg="Override detection logic by setting OLLAMA_LLM_LIBRARY"
ollama[549401]: time=2024-02-28T03:08:52.470Z level=INFO source=gpu.go:94 msg="Detecting GPU type"
ollama[549401]: time=2024-02-28T03:08:52.470Z level=INFO source=gpu.go:265 msg="Searching for GPU management library libnvidia-ml.so"
ollama[549401]: time=2024-02-28T03:08:52.470Z level=DEBUG source=gpu.go:283 msg="gpu management search paths: [/usr/local/cuda/lib64/libnvidia-ml.so* /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so* /usr/lib/x86_64-linux-gnu/libnvidia-ml.so* /usr/lib/wsl/lib/libnvidia-ml.so* /usr/lib/wsl/drivers/*/libnvidia-ml.so* /opt/cuda/lib64/libnvidia-ml.so* /usr/lib*/libnvidia-ml.so* /usr/local/lib*/libnvidia-ml.so* /usr/lib/aarch64-linux-gnu/nvidia/current/libnvidia-ml.so* /usr/lib/aarch64-linux-gnu/libnvidia-ml.so* /opt/cuda/targets/x86_64-linux/lib/stubs/libnvidia-ml.so* /libnvidia-ml.so*]"
ollama[549401]: time=2024-02-28T03:08:52.476Z level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
ollama[549401]: time=2024-02-28T03:08:52.476Z level=INFO source=gpu.go:265 msg="Searching for GPU management library librocm_smi64.so"
ollama[549401]: time=2024-02-28T03:08:52.476Z level=DEBUG source=gpu.go:283 msg="gpu management search paths: [/opt/rocm*/lib*/librocm_smi64.so* /librocm_smi64.so*]"
ollama[549401]: time=2024-02-28T03:08:52.476Z level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
ollama[549401]: time=2024-02-28T03:08:52.476Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
ollama[549401]: time=2024-02-28T03:08:52.476Z level=DEBUG source=amd.go:32 msg="amd driver not detected /sys/module/amdgpu"
ollama[549401]: time=2024-02-28T03:08:52.477Z level=INFO source=routes.go:1042 msg="no GPU detected"

@ketsapiwiq commented on GitHub (Feb 28, 2024):

@ketsapiwiq have you tried installing rocm_v6 as per #2411 ?

Thank you, doing that right now. That's probably it; I hadn't seen the specific ROCm instructions in https://github.com/ollama/ollama/blob/main/docs/development.md#linux-rocm-amd

There's also a nice aur/ollama-rocm-git package in Arch that looks interesting.


@remy415 commented on GitHub (Feb 28, 2024):

@saamerm

There's a different syntax to enable debug when running as a service.

First, edit the ollama service file (sudo nano /etc/systemd/system/ollama.service) and add the environment variable to it:

[Service]
(leave these options alone…)
Environment="OLLAMA_DEBUG=1"

Then restart the service; daemon-reload makes systemd re-read the unit configuration files:

sudo systemctl daemon-reload
sudo systemctl restart ollama.service

Now when you view your logs, debug will be enabled. Note that if you are running ollama serve from the command line, you would do it the way you quoted (OLLAMA_DEBUG=1 ollama serve), or export the variable so it stays enabled for your current terminal session (export OLLAMA_DEBUG=1).

Since you are running it as a service according to your logs, you need to add the env variable to your service file.
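
As an alternative to editing the unit file in place, the same environment variable can be set through a systemd drop-in, which survives reinstalls of the main unit file. This is a sketch based on standard systemd conventions (the drop-in path and filename here are illustrative, not from this thread):

```
# /etc/systemd/system/ollama.service.d/debug.conf
[Service]
Environment="OLLAMA_DEBUG=1"
```

Creating this file via sudo systemctl edit ollama.service opens the drop-in in an editor and reloads systemd on save; otherwise run sudo systemctl daemon-reload manually, then sudo systemctl restart ollama.service.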

@ketsapiwiq Awesome, hope rocm_v6 helps. Let me know please

@ketsapiwiq commented on GitHub (Feb 28, 2024):

@remy415
Thank you so much, it works!!!
EDIT: apparently nothing except the CLBlast_DIR and ROCM_PATH vars were useful
List of env vars I included:

export CLBlast_DIR=/usr/lib/cmake/CLBlast
export ROCM_PATH=/opt/rocm
export HSA_OVERRIDE_GFX_VERSION=10.3.0
export AMDGPU_TARGETS="gfx1030"

For the record, I don't know if those had an effect:

  • I updated my ROCm Arch packages, and did rebuild amdgpu-dkms (which seemed to fail previously), although I didn't even reboot and I still don't have /sys/module/amdgpu/version
@remy415 commented on GitHub (Feb 28, 2024):

@ketsapiwiq If you're interested in figuring out if those had an effect, simply remove them and add them one by one (singularly) until it works. Conversely you could remove them one at a time until it doesn't work.

Anyway congrats I'm glad you got it working!

@cinnamon17 commented on GitHub (Mar 10, 2024):

Hi, I'm having the same error: Error: Post "http://127.0.0.1:11434/api/chat": EOF | Error: exception std::bad_alloc

model: gemma:2b
ollama version: 0.1.28

```
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_ctx_train = 8192
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_embd = 2048
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_head = 8
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_head_kv = 1
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_layer = 18
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_rot = 256
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_embd_head_k = 256
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_embd_head_v = 256
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_gqa = 8
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_embd_k_gqa = 256
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_embd_v_gqa = 256
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: f_norm_eps = 0.0e+00
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: f_norm_rms_eps = 1.0e-06
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: f_clamp_kqv = 0.0e+00
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: f_max_alibi_bias = 0.0e+00
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_ff = 16384
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_expert = 0
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_expert_used = 0
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: pooling type = 0
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: rope type = 2
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: rope scaling = linear
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: freq_base_train = 10000.0
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: freq_scale_train = 1
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: n_yarn_orig_ctx = 8192
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: rope_finetuned = unknown
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: model type = 2B
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: model ftype = Q4_0
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: model params = 2.51 B
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: model size = 1.56 GiB (5.34 BPW)
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: general.name = gemma-2b-it
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: BOS token = 2 '<bos>'
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: EOS token = 1 '<eos>'
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: UNK token = 3 '<unk>'
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: PAD token = 0 '<pad>'
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_print_meta: LF token = 227 '<0x0A>'
Mar 10 13:02:27 ip-172-26-15-150 ollama[3649052]: llm_load_tensors: ggml ctx size = 0.06 MiB
Mar 10 13:02:53 ip-172-26-15-150 ollama[3649052]: llm_load_tensors: CPU buffer size = 1594.93 MiB
Mar 10 13:02:53 ip-172-26-15-150 ollama[3649052]: .....................................................
Mar 10 13:02:53 ip-172-26-15-150 ollama[3649052]: llama_new_context_with_model: n_ctx = 2048
Mar 10 13:02:53 ip-172-26-15-150 ollama[3649052]: llama_new_context_with_model: freq_base = 10000.0
Mar 10 13:02:53 ip-172-26-15-150 ollama[3649052]: llama_new_context_with_model: freq_scale = 1
Mar 10 13:02:53 ip-172-26-15-150 ollama[3649052]: llama_kv_cache_init: CPU KV buffer size = 36.00 MiB
Mar 10 13:02:53 ip-172-26-15-150 ollama[3649052]: llama_new_context_with_model: KV self size = 36.00 MiB, K (f16): 18.00 MiB, V (f16): 18.00 MiB
Mar 10 13:02:53 ip-172-26-15-150 ollama[3649052]: time=2024-03-10T13:02:53.818Z level=WARN source=llm.go:162 msg="Failed to load dynamic library /tmp/ollama3814554119/cpu_avx2/libext_server.so exception std::bad_alloc"
```

@remy415 commented on GitHub (Mar 10, 2024):

@cinnamon17 Hello, could you please provide more information?

Needed:
OS, CPU, GPU, driver version number, and the rest of the server log. It would appear you are currently running the cpu_avx2 build, as shown by /tmp/ollama3814554119/cpu_avx2/libext_server.so exception std::bad_alloc. If you're running a CPU that doesn't have AVX2 (Intel prior to Haswell, circa ~2013) or any ARM processor (Mac M1/2/3, Jetson devices), that build will fail because it tries to use CPU extensions that aren't present.

@cinnamon17 commented on GitHub (Mar 10, 2024):

@remy415 Hi!

CPU: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
OS: Debian GNU/Linux 11 (bullseye) x86_64
GPU: 00:02.0 Cirrus Logic GD 5446 - I don't have one; it's a hosted server, but this is what it prints

Extended log:

ollama.log

@remy415 commented on GitHub (Mar 10, 2024):

@cinnamon17 Thank you! How much RAM is available to your hosted server? The log is reporting ~1.5 GB available to the CPU, so maybe the model is too large for the amount of free memory? Try ollama run tinyllama and see if that works correctly -- it only occupies ~700 MB of RAM.

I may have been wrong about the AVX2 being the issue, if so I apologize. If you enter cat /proc/cpuinfo | grep -i avx, can you confirm the presence of avx2 in the list anywhere? That would rule that out as an issue.

@cinnamon17 commented on GitHub (Mar 10, 2024):

It worked! How much memory do large models require? I have 2 GB of RAM available

The result of running cat /proc/cpuinfo | grep -i avx is:

flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault invpcid_single pti fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt

@remy415 commented on GitHub (Mar 10, 2024):

With 2GB of RAM, you won't be able to run much. Most 7b LLMs eat 4-8GB of RAM. The gemma:2b you tried to run is ~1.5GB in size (your system reported ~1.5GB free). Ollama automatically sets 10% or 1GB of buffer space, whichever is lower; this would limit your model size on your system to <1GB.
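
The headroom arithmetic above can be sketched in a few lines of Python. This assumes the buffer rule exactly as stated in the comment (reserve the smaller of 10% of free RAM or 1 GiB) and uses the sizes from cinnamon17's log; it is an illustration, not Ollama's actual allocation code:

```python
GIB = 1024**3

def usable_for_model(free_bytes: float) -> float:
    """Free RAM left for the model after the reserved buffer
    (assumption: buffer = min(10% of free RAM, 1 GiB))."""
    buffer = min(0.10 * free_bytes, 1 * GIB)
    return free_bytes - buffer

free = 1.5 * GIB    # roughly what the log reported as available
model = 1.56 * GIB  # gemma:2b Q4_0 weights per llm_load_print_meta

print(usable_for_model(free) / GIB)   # ~1.35 GiB usable
print(model > usable_for_model(free)) # model exceeds usable RAM -> bad_alloc
```

On a 2 GiB box the same rule leaves roughly 1.8 GiB for the model, which is why tinyllama (~700 MB) fits while gemma:2b does not.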

@cinnamon17 commented on GitHub (Mar 10, 2024):

@remy415 I see, I'll upgrade the RAM memory. Thank you for your time; I really appreciate that.

@remy415 commented on GitHub (Mar 10, 2024):

Not a problem, good luck!

@dhiltgen commented on GitHub (Mar 11, 2024):

@saamerm are you still having trouble running dolphin-phi on a linux system?

If so, can you try upgrading to the latest release and share your server log?

(I just tried to repro on 0.1.29 with a Nvidia GT 1030 GPU (4G) and the model loads and works properly for me)

@pjbrunet commented on GitHub (Mar 12, 2024):

I just get this error with starcoder2

And same error if I do this...
ollama rm starcoder2:15b
ollama run starcoder2

No problem with other models.
ollama version is 0.1.28

@miharekar commented on GitHub (Mar 12, 2024):

@pjbrunet that's the issue we discussed in #2953. I got it to run by building ollama myself.

@dhiltgen commented on GitHub (Mar 13, 2024):

There may be multiple different topics in this issue. @saamerm can you clarify what GPU you have, or is this a CPU only system? Your log earlier showed no GPU being detected, but other commenters on this ticket are on Radeon systems with iGPUs. I just fixed an iGPU discovery bug yesterday which will make its way into the final 0.1.29 release. (updated builds with the fix should be posted later today)

@dhiltgen commented on GitHub (Apr 15, 2024):

It sounds like @saamerm is no longer having his original problem.

If folks who commented on this PR are still having problems, please open a new issue describing your problem, and share your server log so we can try to understand what went wrong.

Reference: github-starred/ollama#27257