[GH-ISSUE #5090] amdgpu version file missing when running via systemd #28971

Closed
opened 2026-04-22 07:33:26 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @ghost on GitHub (Jun 16, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5090

The previous issue was closed as fixed, but the bug still exists.

Hi, this doesn't happen to me when running ollama as root directly in a shell, but it happens when I start ollama as a service (regardless of the user):

```
amnesia λ ~/ sudo systemctl status ollama
● ollama.service - Ollama Service
     Loaded: loaded (/etc/systemd/system/ollama.service; enabled; preset: disabled)
     Active: active (running) since Sun 2024-06-16 16:47:04 PDT; 39s ago
   Main PID: 7273 (ollama)
      Tasks: 18 (limit: 38365)
     Memory: 561.0M (peak: 615.8M)
        CPU: 5.644s
     CGroup: /system.slice/ollama.service
             └─7273 /usr/local/bin/ollama serve

Jun 16 16:47:37 dead ollama[7273]: time=2024-06-16T16:47:37.252-07:00 level=INFO source=amd_linux.go:304 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION="\"10.3.0\""
Jun 16 16:47:37 dead ollama[7273]: time=2024-06-16T16:47:37.506-07:00 level=WARN source=amd_linux.go:48 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sy>
Jun 16 16:47:37 dead ollama[7273]: time=2024-06-16T16:47:37.511-07:00 level=INFO source=amd_linux.go:304 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION="\"10.3.0\""
Jun 16 16:47:37 dead ollama[7273]: time=2024-06-16T16:47:37.714-07:00 level=WARN source=sched.go:511 msg="gpu VRAM usage didn't recover within timeout" seconds=5.045401852
Jun 16 16:47:37 dead ollama[7273]: time=2024-06-16T16:47:37.749-07:00 level=WARN source=amd_linux.go:48 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sy>
Jun 16 16:47:37 dead ollama[7273]: time=2024-06-16T16:47:37.753-07:00 level=INFO source=amd_linux.go:304 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION="\"10.3.0\""
Jun 16 16:47:37 dead ollama[7273]: time=2024-06-16T16:47:37.964-07:00 level=WARN source=sched.go:511 msg="gpu VRAM usage didn't recover within timeout" seconds=5.295260429
Jun 16 16:47:38 dead ollama[7273]: time=2024-06-16T16:47:38.007-07:00 level=WARN source=amd_linux.go:48 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sy>
Jun 16 16:47:38 dead ollama[7273]: time=2024-06-16T16:47:38.012-07:00 level=INFO source=amd_linux.go:304 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION="\"10.3.0\""
Jun 16 16:47:38 dead ollama[7273]: time=2024-06-16T16:47:38.214-07:00 level=WARN source=sched.go:511 msg="gpu VRAM usage didn't recover within timeout" seconds=5.545797232
```

But somehow:

```
amnesia λ ~/ sudo ROCR_VISIBLE_DEVICES=0 HSA_OVERRIDE_GFX_VERSION="10.3.0" OLLAMA_DEBUG=1 ollama serve
```

works fine, and I can chat without issue. Here's my service file; note that I have tried both the ollama user and the root user (the ollama user is properly configured and is in the render & video groups):

```
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=root
Group=root
Restart=always
RestartSec=3
Environment="PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin:/usr/local/lib/baresip/modules"
Environment="ROCR_VISIBLE_DEVICES=0"
Environment="HSA_OVERRIDE_GFX_VERSION=\"10.3.0\""

[Install]
WantedBy=default.target
```
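A note on the `Environment=` quoting above, grounded in systemd's quoting rules (systemd.exec(5)) rather than anything ollama-specific: escaped inner quotes survive as literal characters of the value, which the journal output above confirms (`HSA_OVERRIDE_GFX_VERSION="\"10.3.0\""`). A sketch of what each form passes to the process:

```
# What each Environment= form yields (per systemd.exec(5) quoting rules):
Environment="HSA_OVERRIDE_GFX_VERSION=\"10.3.0\""   # value: "10.3.0"  (literal quotes included)
Environment="HSA_OVERRIDE_GFX_VERSION=10.3.0"       # value: 10.3.0
Environment=HSA_OVERRIDE_GFX_VERSION=10.3.0         # value: 10.3.0
```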

Both in the shell and as a service, it reports using the same GPU (id=0, 6700 XT):

```
level=INFO source=amd_linux.go:71 msg="inference compute" id=0 library=rocm compute=gfx1031 driver=0.0 name=1002:73df total="12.0 GiB" available="12.0 GiB"
```

Originally posted by @pulpocaminante in https://github.com/ollama/ollama/issues/4427#issuecomment-2171941948

GiteaMirror added the amdbug label 2026-04-22 07:33:27 -05:00

@ghost commented on GitHub (Jun 18, 2024):

Changing:

```
Environment="HSA_OVERRIDE_GFX_VERSION=\"10.3.0\""
```

to:

```
Environment="HSA_OVERRIDE_GFX_VERSION="10.3.0""
```

fixes the issue.
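This is consistent with systemd's quoting rules: in the original line, the escaped inner quotes become literal characters of the value, which is why the journal logged `HSA_OVERRIDE_GFX_VERSION="\"10.3.0\""`. A minimal shell sketch of the mismatch (the variable names are illustrative, not from ollama; the assumption is that some consumer of the override does an exact string match on the version):

```shell
# Value produced by the escaped Environment= line vs. the working form.
escaped='"10.3.0"'   # literal quote characters included (8 chars)
plain='10.3.0'       # 6 chars

if [ "$escaped" = "$plain" ]; then
  echo "match"
else
  echo "mismatch"    # this branch runs: the strings differ
fi
```

So the service and the interactive shell were exporting different strings for the same variable, even though both appeared to say `10.3.0`.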

<!-- gh-comment-id:2176767016 -->

Reference: github-starred/ollama#28971